Training CNN for 3D image to image with CombinedDatastore

Question

Andrew Scott 2023-7-4

Direct link to this question: https://ww2.mathworks.cn/matlabcentral/answers/1991563-training-cnn-for-3d-image-to-image-with-combineddatastore

Commented: Andrew 2025-6-16

Hi,

I am trying to set up a neural network to denoise 3D grayscale images (image to image). I based the script on the cardiac MRI segmentation example:

https://uk.mathworks.com/help/deeplearning/ug/cardiac-left-ventricle-segmentation-from-cine-mri-images.html

and I was able to convert that to successfully denoise 2D images. For the 3D example, I used the cardiac data from the BraTS dataset and added noise to a copy to create a noisy input dataset of 20 volumes, each of size 320x320x130.

The network looks ok in analyzeNetwork, but:

>>trainNetwork(dsTrain, net3dlg, options);

fails with the error:

"Error using trainNetwork (line 184)

Invalid training data. Elements in column 1 returned by the input datastore must be 4-D numeric arrays"

When I run:

>>dsTrain.read

ans =

1x2 cell array

{320x320x130 double} {320x320x130 double}

and

This seems to be the 3D equivalent of what I was using for the 2D network, and I can't see why it fails, as the 4th dimension is simply 1. I have tried using repmat to give 320x320x130x3 arrays, but that gives the same error.

What does trainNetwork expect in this case?

Thanks in advance!

Andy

Full code:

forcecreatetraining=false;
trainingdatafolder='~/Documents/MATLAB/training_data/Task02_Heart/imagesTr/';
dataFoldernoisy='~/Documents/noisydata/Task02_Heart/noisy_images';
zsize=130;  %EMPIRICALLY THE LARGEST SIZE OF THE Z DIMENSION IN TRAINING.
%%
imageLayer = image3dInputLayer([320,320,zsize, 1],'Name', 'Image_input');
net3d=[imageLayer,...
    convolution3dLayer(3, 32, 'Name', 'Convolution-1-1', Padding='same'),...
    reluLayer('Name', "Encoder-ReLU1-1"),...
    convolution3dLayer(3, 32, 'Name', 'Convolution-1-2', Padding='same'),...
    reluLayer('Name', "Encoder-ReLU1-2"),...
    maxPooling3dLayer(2, 'Stride',[2 2 2],'Name', 'Max_pool1'),...
    convolution3dLayer(3, 64, 'Name', 'Convolution-2-1', Padding='same'),...
    reluLayer('Name', "Encoder-ReLU2-1"),...
    convolution3dLayer(3, 64, 'Name', 'Convolution-2-2', Padding='same'),...
    reluLayer('Name', "Encoder-ReLU2-2"),...
    dropoutLayer(0.5, 'Name', 'downward_dropout'),...
    maxPooling3dLayer(2, 'Stride',[2 2 2],'Name', 'Max_pool2'),...    
    convolution3dLayer(2, 128, 'Name', 'Convolution-3-1', Padding='same'),...
    reluLayer('Name',"Bridge-ReLU3-1"),...
    convolution3dLayer(3, 128, 'Name', 'Convolution-3-2', Padding='same'),...
    reluLayer('Name', "Bridge-ReLU3-2"),...
    dropoutLayer(0.5, 'Name', 'bridge_dropout'),...
    transposedConv3dLayer(3, 64, 'Name', 'TConvolution-3-3', 'Stride', [2 2 2],'Cropping', 'same' ),... %, Padding='same'),...
    reluLayer('Name', "BridgeUP-ReLU3-3_U"),...
    depthConcatenationLayer(2, 'Name', 'UpConcatenation-1'),...
    convolution3dLayer(2, 64, 'Name', 'Convolution-2-1_U', Padding='same'),...
    reluLayer('Name',"Decoder-ReLU2-1_U"),...
    convolution3dLayer(3, 64, 'Name', 'Convolution-2-2_U', Padding='same'),...
    reluLayer('Name', "Decoder-ReLU2-2_U"),...
    transposedConv3dLayer(3, 32, 'Name', 'TConvolution-2-3', 'Stride', [2 2 2], 'Cropping', 'same' ),... %, Padding='same'),...
    reluLayer('Name', "DecoderUp-ReLU3-2_U"),...
    depthConcatenationLayer(2, 'Name', 'UpConcatenation-2'),...
    convolution3dLayer(3, 32, 'Name', 'Convolution-1-1_U', Padding='same'),...
    reluLayer('Name', "Decoder-ReLU1-1_U"),...
    convolution3dLayer(3, 32, 'Name', 'Convolution-1-2_U', Padding='same'),...
    reluLayer('Name', "Decoder-ReLU1-2_U"),...
    convolution3dLayer(1,1, 'Name', 'Final_convolution', Padding='same'),...
    regressionLayer('Name', 'Regression_layer')];
net3dlg=layerGraph(net3d);  %Need to convert to a layer graph before layers can be connected.
net3dlg=connectLayers(net3dlg, 'Encoder-ReLU1-2', 'UpConcatenation-2/in2');
net3dlg=connectLayers(net3dlg, 'Encoder-ReLU2-2', 'UpConcatenation-1/in2');
%%
%Load the BraTS cardiac dataset
volReader = @(x) niftiread(x);
imds = imageDatastore(trainingdatafolder, ...
    'FileExtensions','.gz','ReadFcn',volReader);
if(~exist(dataFoldernoisy, 'dir') || forcecreatetraining)
    mkdir (dataFoldernoisy)
    noisescale=256;  %determined empirically
    imdstransformnoisy=transform(imds, @(x) addnoisecalcmag(x, noisescale));
    writeall(imdstransformnoisy, dataFoldernoisy,'WriteFcn', @myniftiwrite);
end
imds_noisy = imageDatastore(dataFoldernoisy, ...
    'FileExtensions','.nii','ReadFcn',volReader, 'IncludeSubfolders', true);
numImages = numel(imds.Files);
imds=transform(imds, @(x) paddz(x, zsize));
imds_noisy=transform(imds_noisy, @(x) paddz(x, zsize));
combinedDS = combine(imds,imds_noisy);
    
numTrain = round(0.8*numImages);
numVal = round(0.2*numImages);
shuffledIndices = randperm(numImages);
dsTrain = subset(combinedDS,shuffledIndices(1:numTrain));
dsVal = subset(combinedDS,shuffledIndices(numTrain+1:numTrain+numVal));
dsTrain = transform(dsTrain,@(data) dummytransform(data));  %Does nothing for now, but does mean that dsTrain matches the 2D example
options = trainingOptions("adam", ...
        InitialLearnRate=0.001,...
        GradientDecayFactor=0.999,...
        L2Regularization=0.002, ...
        MaxEpochs=10, ...
        MiniBatchSize=128, ...
        Shuffle="every-epoch", ...
        Verbose=false,...
        VerboseFrequency=100,...
        ValidationData={dsVal.UnderlyingDatastores{1}, dsVal.UnderlyingDatastores{2}},...   
        Plots="training-progress",...
        ExecutionEnvironment="cpu",...
        ResetInputNormalization=true);
%%
[trainedNet, info] = trainNetwork(dsTrain,net3dlg,options);     %The line that fails
%%
%extra functions used
function output=addnoisecalcmag(img, noisescale)
    output=(abs(double(img)+noisescale.*(randn(size(img))+1i*randn(size(img)))));
end
function myniftiwrite(data,writeInfo,outputType)
    niftiwrite(data, writeInfo.SuggestedOutputName);
end
function paddeddata=paddz(datain, sizetopaddto)
    if(size(datain, 3)<=sizetopaddto)
        xykdata=ifftshift(fft(fftshift(datain), [],3));
        xykdata=padarray(xykdata, [0 0 ceil((sizetopaddto-size(datain,3))/2)], 0, 'both');
        paddeddata=fftshift(ifft(fftshift(xykdata), [], 3));
        paddeddata=paddeddata(:,:,1:sizetopaddto);  %in case the rounding makes it too long.
    else
        xykdata=ifftshift(fft(fftshift(datain), [],3));
        shavesize=floor((size(datain,3)-sizetopaddto)/2);
        xykdata=xykdata(:,:,shavesize:(size(xykdata, 3)-shavesize));
        paddeddata=fftshift(ifft(fftshift(xykdata), [], 3));
        paddeddata=paddeddata(:,:,1:sizetopaddto);  %in case the rounding makes it too long.
    end
    paddeddata={paddeddata};
end
function transformeddata=dummytransform(datain) %basically does nothing...
    data1=datain{1};
    data2=datain{2};
    transformeddata={double(data1), double(data2)}; 
end

4 Comments (2 earlier comments hidden)

Siraj 2023-8-23

Hi @Andrew Scott, when I try to build the network from the code you posted, I get the following error.

Andrew 2025-6-16

Thanks @Siraj. I didn't see your message when you originally posted it. This was part of the problem, which I solved by changing the size of the input to 132.
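
The rounding problem mentioned above can be illustrated with a little arithmetic. This is a sketch under stated assumptions, not a reproduction of the error Siraj saw: it assumes unpadded max pooling with pool size 2 and stride 2 (output size floor((n-2)/2)+1) and stride-2 transposed convolution with 'same' cropping (output size 2*n), as configured in the posted network. The helper names poolOut and upOut are hypothetical.

```matlab
% Trace the depth (z) size through the encoder and decoder of the posted network.
poolOut = @(n) floor((n - 2)/2) + 1;   % max pooling, pool size 2, stride 2, no padding
upOut   = @(n) 2*n;                    % transposed convolution, stride 2, 'same' cropping

for z = [130 132]
    skip1  = z;                 % depth at Encoder-ReLU1-2 (feeds UpConcatenation-2)
    skip2  = poolOut(skip1);    % depth at Encoder-ReLU2-2 (feeds UpConcatenation-1)
    bridge = poolOut(skip2);    % depth in the bridge
    up1    = upOut(bridge);     % must equal skip2 for UpConcatenation-1
    up2    = upOut(up1);        % must equal skip1 for UpConcatenation-2
    fprintf('z=%d: skip2=%d vs up1=%d, skip1=%d vs up2=%d\n', ...
        z, skip2, up1, skip1, up2);
end

% Smallest depth >= 130 that survives two halvings without rounding
zsizeFixed = 4*ceil(130/4);
```

With a depth of 130 the second pooling step rounds 65 down to 32, so the upsampled branches no longer match the skip connections; any depth divisible by 4 avoids this, and 132 is the smallest such value that is at least 130.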

Answer 1

Gayathri 2025-6-12

Direct link to this answer: https://ww2.mathworks.cn/matlabcentral/answers/1991563-training-cnn-for-3d-image-to-image-with-combineddatastore#answer_1566428

Hi @Andrew Scott,

In MATLAB’s Deep Learning Toolbox, for 3D image-to-image tasks (like denoising or segmentation), the input to trainNetwork typically requires 4-D arrays with dimensions [height width depth channels], where channels is the number of channels (e.g., 1 for grayscale, 3 for RGB). For 3D images, each volume should be a 4-D array of size [320 320 130 1], where the last dimension (channels) is 1.

When you tried repmat to create [320 320 130 3] arrays, it still failed because the network expects channels=1 (grayscale), not channels=3, as set by the input layer of net3dlg shown in the code below.

imageLayer = image3dInputLayer([320,320,zsize, 1],'Name', 'Image_input');

To fix the error, you need to ensure that the dsTrain datastore returns 4-D arrays of size [320 320 130 1].

Then passing this transformed "dsTrain" to "trainNetwork" will most likely resolve the error.
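
A minimal sketch of how the datastore output might be checked before training; it is not the answerer's original code, and isValidVolume is a hypothetical helper name. MATLAB does not store trailing singleton dimensions, so a 320x320x130 array already reports a size of 1 in the fourth dimension. The sketch therefore checks size(x,4) rather than ndims(x), and also checks that the data are numeric and real. A small 8x8x6 stand-in is used here instead of a full volume.

```matlab
% Hypothetical check: is x a real numeric array whose first four sizes match sz?
isValidVolume = @(x, sz) isnumeric(x) && isreal(x) && ...
    isequal([size(x,1) size(x,2) size(x,3) size(x,4)], sz);

% For the network in the question, sz would be [320 320 130 1].
sz  = [8 8 6 1];
vol = zeros(8, 8, 6);                  % stand-in for one element of read(dsTrain)

okReal    = isValidVolume(vol, sz);              % real array of the expected size
okComplex = isValidVolume(complex(vol, 0), sz);  % same size, but stored as complex
okWrongCh = isValidVolume(repmat(vol, [1 1 1 3]), sz);  % 3 channels instead of 1
```

In the actual script, the same check could be applied to each cell returned by read(dsTrain) before calling trainNetwork.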

For more information on the trainNetwork function, please refer to the following documentation link.

https://www.mathworks.com/help/deeplearning/ref/trainnetwork.html

1 Comment

Andrew 2025-6-16

Thank you for looking at this post from a couple of years ago. I looked into this again. Some of the MATLAB functions have since changed. It turns out that the problem was that the arrays returned in the two-column cell array from dsTrain.read were complex. It seems that trainNetwork can't cope with this, but neither does it provide a useful error message when that is the problem...

I had only tried padding the arrays to [320 320 130 3] as one of the tests I performed to try to find the error. It wasn't useful, and the final dimension of 1 was fine (although I did need to change to [320 320 132 1] to avoid a problem with rounding).
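
A minimal sketch of one way the complex values could be removed before training. It assumes the magnitude is the quantity of interest, as for the magnitude images produced by addnoisecalcmag; real() might be more appropriate if the imaginary part is only numerical residue from ifft. The names toRealSingle and fixPair are hypothetical and not part of the original script. Small 8x8x6 arrays stand in for the full volumes.

```matlab
% Hypothetical helper: return a real, single-precision array.
% Using abs() is an assumption; see the note above about real().
toRealSingle = @(x) single(abs(x));

% Apply the helper to both columns returned by the combined datastore
fixPair = @(data) {toRealSingle(data{1}), toRealSingle(data{2})};

% Stand-in for one read from the datastore: two complex 8x8x6 volumes
pair  = {complex(rand(8,8,6), 1e-12*rand(8,8,6)), complex(rand(8,8,6), 0)};
fixed = fixPair(pair);

% Show that the data were complex before the fix and are real afterwards
disp([isreal(pair{1}), isreal(fixed{1})])

% In the script this could be used in place of dummytransform, for example:
% dsTrain = transform(dsTrain, fixPair);
```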
