Hi,
There seems to be a mismatch between the input that yolov2TransformLayer expects and the input it actually receives. Based on the "RotulosVagem.mat" and "lgraph" you provided, I assume you want to train a YOLO v2 network with 2 anchor boxes for 1 class.
For this configuration, the last convolutional layer before yolov2TransformLayer in "lgraph" must have (5 + numClasses) x numAnchorBoxes = (5 + 1) x 2 = 12 output filters, but that layer in the current network has 20 filters.
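As a quick check of that number, here is a minimal sketch of the arithmetic. The values 1 and 2 are hard-coded from my assumption above about your setup (1 class, 2 anchor boxes); they are not read from your files.

```matlab
% Assumed values for this setup (not read from the .mat file or lgraph)
numClasses = 1;       % one object class
numAnchorBoxes = 2;   % two anchor boxes

% Each anchor box predicts 4 box coordinates + 1 objectness score,
% plus one score per class
outFilters = (5 + numClasses) * numAnchorBoxes;
disp(outFilters)      % displays 12
```

If you later add classes or anchor boxes, the required filter count changes accordingly, which is why the code further down computes it from the data instead of hard-coding it.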
The issue can be resolved by updating the output filters of the last convolutional layer. You can try the following code:
lgraph = lgraph.lgraph;
[imds, blds] = objectDetectorTrainingData(gTruth);
ds = combine(imds, blds);
options = trainingOptions('sgdm');
% Start of the code to be added
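% Number of classes = number of label columns in the ground truth
numClasses = size(vagem.gTruth.LabelData,2);
% Number of anchor boxes, read from the last layer (assumed to be the YOLO v2 output layer)
numAnchorBoxes = size(lgraph.Layers(end).AnchorBoxes,1);
% Each anchor box predicts 4 box coordinates, 1 objectness score and numClasses class scores
outFilters = (5+numClasses)*numAnchorBoxes;
yolov2ConvLayer = convolution2dLayer(3,outFilters,'Name','yolov2ConvUpdated',...
    'Padding','same',...
    'WeightsInitializer',@(sz)randn(sz)*0.01);
yolov2ConvLayer.Bias = zeros(1,1,outFilters);
% Swap the old convolutional layer for the one with the correct number of filters
lgraph = replaceLayer(lgraph,'yolov2ClassConv',yolov2ConvLayer);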
% End of the code to be added
[detector,info] = trainYOLOv2ObjectDetector(ds,lgraph,options);