Semantic segmentation: why do the network output dimensions not match the input?

Question

Raúl Rivera 2024-2-18


Direct link to this question

https://ww2.mathworks.cn/matlabcentral/answers/2083548-image-semantic-segmentation-dimension-layer-why-not-match-input-and-output

Edited: Matt J 2024-2-18

Hello, I want to apply a semantic segmentation algorithm to a batch of images in order to detect their failures (7 types of them) with a CNN. I used Image Labeler to label 7 classes of failures in 80 images (I don't know if that's enough data). The size of the images is [743 1324 3], but the size of the network output does not match. How can I make them match? I attach the code I use and have linked the image files and their labels. Thanks.

load pixel_label_training.mat 
% Alternative: build the datastores from a groundTruth object named gTruth
% [imds,pxdsTruth] = pixelLabelTrainingData(gTruth);
trainingData = combine(imds,pxdsTruth);
inputSize=[743 1324 3]
inputSize = 1×3
         743        1324           3
imgLayer = imageInputLayer(inputSize)
imgLayer = 
  ImageInputLayer with properties:

                      Name: ''
                 InputSize: [743 1324 3]
        SplitComplexInputs: 0

   Hyperparameters
          DataAugmentation: 'none'
             Normalization: 'zerocenter'
    NormalizationDimension: 'auto'
                      Mean: []
filterSize = 4;
numFilters = 30;
conv = convolution2dLayer(filterSize,numFilters,'Padding',1);
relu = reluLayer();
poolSize = 2;
maxPoolDownsample2x = maxPooling2dLayer(poolSize,'Stride',2,'Padding',[1 1]);
downsamplingLayers = [
    conv
    relu
    maxPoolDownsample2x
    conv
    relu
    maxPoolDownsample2x
    ]
downsamplingLayers = 
  6×1 Layer array with layers:

     1   ''   2-D Convolution   30 4×4 convolutions with stride [1  1] and padding [1  1  1  1]
     2   ''   ReLU              ReLU
     3   ''   2-D Max Pooling   2×2 max pooling with stride [2  2] and padding [1  1  1  1]
     4   ''   2-D Convolution   30 4×4 convolutions with stride [1  1] and padding [1  1  1  1]
     5   ''   ReLU              ReLU
     6   ''   2-D Max Pooling   2×2 max pooling with stride [2  2] and padding [1  1  1  1]
filterSize = 1;
transposedConvUpsample2x = transposedConv2dLayer(filterSize,numFilters,'Stride',2,'Cropping',1)
transposedConvUpsample2x = 
  TransposedConvolution2DLayer with properties:

            Name: ''

   Hyperparameters
      FilterSize: [1 1]
     NumChannels: 'auto'
      NumFilters: 30
          Stride: [2 2]
    CroppingMode: 'manual'
    CroppingSize: [1 1 1 1]

   Learnable Parameters
         Weights: []
            Bias: []

Use properties method to see a list of all properties.
upsamplingLayers = [
    transposedConvUpsample2x
    relu
    transposedConvUpsample2x
    relu
    ]
upsamplingLayers = 
  4×1 Layer array with layers:

     1   ''   2-D Transposed Convolution   30 1×1 transposed convolutions with stride [2  2] and cropping [1  1  1  1]
     2   ''   ReLU                         ReLU
     3   ''   2-D Transposed Convolution   30 1×1 transposed convolutions with stride [2  2] and cropping [1  1  1  1]
     4   ''   ReLU                         ReLU
numClasses = 7;
conv1x1 = convolution2dLayer(1,numClasses);
finalLayers = [
    conv1x1
    softmaxLayer()
    pixelClassificationLayer()
    ]
finalLayers = 
  3×1 Layer array with layers:

     1   ''   2-D Convolution              7 1×1 convolutions with stride [1  1] and padding [0  0  0  0]
     2   ''   Softmax                      softmax
     3   ''   Pixel Classification Layer   Cross-entropy loss 
net = [
    imgLayer    
    downsamplingLayers
    upsamplingLayers
    finalLayers
    ]
net = 
  14×1 Layer array with layers:

     1   ''   Image Input                  743×1324×3 images with 'zerocenter' normalization
     2   ''   2-D Convolution              30 4×4 convolutions with stride [1  1] and padding [1  1  1  1]
     3   ''   ReLU                         ReLU
     4   ''   2-D Max Pooling              2×2 max pooling with stride [2  2] and padding [1  1  1  1]
     5   ''   2-D Convolution              30 4×4 convolutions with stride [1  1] and padding [1  1  1  1]
     6   ''   ReLU                         ReLU
     7   ''   2-D Max Pooling              2×2 max pooling with stride [2  2] and padding [1  1  1  1]
     8   ''   2-D Transposed Convolution   30 1×1 transposed convolutions with stride [2  2] and cropping [1  1  1  1]
     9   ''   ReLU                         ReLU
    10   ''   2-D Transposed Convolution   30 1×1 transposed convolutions with stride [2  2] and cropping [1  1  1  1]
    11   ''   ReLU                         ReLU
    12   ''   2-D Convolution              7 1×1 convolutions with stride [1  1] and padding [0  0  0  0]
    13   ''   Softmax                      softmax
    14   ''   Pixel Classification Layer   Cross-entropy loss 
opts = trainingOptions('sgdm', ...
    'InitialLearnRate',1e-3, ...
    'MaxEpochs',100, ...
    'MiniBatchSize',1);
net = trainNetwork(trainingData,net,opts);
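
As a rough check, the height and width after each layer of the array above can be worked out by hand from the standard output-size formulas for convolution, pooling and transposed convolution. The sketch below is plain arithmetic and needs no toolbox. The helper names (convOut, poolOut, tconvOut) are illustrative and are not part of the original code.

```matlab
% Spatial output-size formulas (height and width only), assuming symmetric
% padding/cropping as in the layer array above.
convOut  = @(sz, f, p)    sz + 2*p - f + 1;               % stride-1 convolution
poolOut  = @(sz, k, s, p) floor((sz + 2*p - k) / s) + 1;  % max pooling
tconvOut = @(sz, f, s, c) s*(sz - 1) + f - 2*c;           % transposed convolution

sz = [743 1324];              % input height and width
sz = convOut(sz, 4, 1);       % 4x4 conv, padding 1           -> [742 1323]
sz = poolOut(sz, 2, 2, 1);    % 2x2 pool, stride 2, padding 1 -> [372 662]
sz = convOut(sz, 4, 1);       % 4x4 conv, padding 1           -> [371 661]
sz = poolOut(sz, 2, 2, 1);    % 2x2 pool, stride 2, padding 1 -> [186 331]
sz = tconvOut(sz, 1, 2, 1);   % 1x1 transposed conv, stride 2, cropping 1 -> [369 659]
sz = tconvOut(sz, 1, 2, 1);   % 1x1 transposed conv, stride 2, cropping 1 -> [735 1315]
outSize = convOut(sz, 1, 0);  % 1x1 conv, no padding          -> [735 1315]

disp(outSize)
```

Under these formulas the network produces a 735-by-1315 map per class, while the pixel labels are 743-by-1324. Each 4x4 convolution with padding 1 loses one pixel per dimension, and the 1x1 transposed convolutions with cropping 1 give 2*n - 3 rather than 2*n.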

images: Images

labels: pixelLabelData


Answer 1

Matt J 2024-2-18


Direct link to this answer

https://ww2.mathworks.cn/matlabcentral/answers/2083548-image-semantic-segmentation-dimension-layer-why-not-match-input-and-output#answer_1411223

Edited: Matt J 2024-2-18


Below is what analyzeNetwork gives. Only you know what you want the dimensions of each of your activation maps to be. Tell us at what point in the network it starts to deviate from what you intended.

analyzeNetwork(net)
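
If the intent is for the output to have the same height and width as the pixel labels, one possible configuration can be checked with the same kind of arithmetic. This is an assumption for illustration, not something confirmed in this thread: convolutions with 'Padding','same' (which preserve size at stride 1), 2x2 max pooling with stride 2 and no padding, and 4x4 transposed convolutions with stride 2 and cropping 1. The helper names below are hypothetical.

```matlab
% Size check for a hypothetical size-preserving encoder/decoder.
% 'same'-padded stride-1 convolutions do not change height or width,
% so only the pooling and transposed-convolution layers are traced.
poolOut  = @(sz, k, s, p) floor((sz + 2*p - k) / s) + 1;  % max pooling
tconvOut = @(sz, f, s, c) s*(sz - 1) + f - 2*c;           % transposed convolution

% Two 2x downsampling steps followed by two 2x upsampling steps.
roundTrip = @(sz) tconvOut(tconvOut(poolOut(poolOut(sz, 2, 2, 0), 2, 2, 0), 4, 2, 1), 4, 2, 1);

resized  = roundTrip([744 1324]);  % height resized to a multiple of 4 -> [744 1324]
original = roundTrip([743 1324]);  % original height                   -> [740 1324]

disp(resized)
disp(original)
```

With these settings the sizes only round-trip when both dimensions are multiples of 4. 1324 already is, but 743 is not, so the images and their label images would have to be resized or padded to, for example, 744 rows (using nearest-neighbour interpolation for the labels so that class IDs are not blended).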
