How to use the trainNetwork function with video input?
I have written code to recognize the characters 'A' and 'B', but during training I got the following error.
Error using trainNetwork
Invalid training data. Predictors must be a numeric array, a datastore, or a table. For networks with sequence input, predictors can also be
a cell array of sequences.
Error in trialjune25_2022 (line 77)
[netLSTM,info] = trainNetwork(imdsTrainData',labelsTrain,layers,options);
____
size(imdsTrainData')=> 37 x 1 cell array
size(labelsTrain)=> 37 x 1 cell array.
Please help to resolve the error.
clear all;
close all;
clc;
idx = 1;
files={'trainsp1_A1.avi';'trainsp1_A2.avi';'trainsp1_A3.avi';'trainsp1_A4.avi';'trainsp1_A5.avi';'trainsp1_A6.avi';'trainsp1_A7.avi'; ...
'trainsp2_A1.avi';'trainsp2_A2.avi';'trainsp2_A3.avi';'trainsp2_A4.avi';'trainsp2_A5.avi';'trainsp2_A6.avi';'trainsp2_A7.avi'; ...
'trainsp3_A1.avi';'trainsp3_A2.avi';'trainsp3_A3.avi';'trainsp3_A4.avi';'trainsp3_A5.avi';'trainsp3_A6.avi';'trainsp3_A7.avi'; ...
'trainsp1_B1.avi';'trainsp1_B2.avi';'trainsp1_B3.avi';'trainsp1_B4.avi';'trainsp1_B5.avi';'trainsp1_B6.avi';'trainsp1_B7.avi'; ...
'trainsp2_B1.avi';'trainsp2_B2.avi';'trainsp2_B3.avi';'trainsp2_B4.avi';'trainsp2_B5.avi';'trainsp2_B6.avi';'trainsp2_B7.avi'; ...
'trainsp3_B1.avi';'trainsp3_B2.avi';'trainsp3_B3.avi';'trainsp3_B4.avi';'trainsp3_B5.avi';'trainsp3_B6.avi';'trainsp3_B7.avi'; ...
};
% labels1=categorical([ones(1,21) 2*ones(1,21)]);% 2*ones(1,8) 3*ones(1,8) 4*ones(1,8) 5*ones(1,7) 6*ones(1,3) 7*ones(1,8) 8*ones(1,6) 9*ones(1,6)]);
% labels=labels1';
labels={1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,...
2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2};
numFiles = numel(files)
% sequences = cell(numFiles,1);
for i = 1:numFiles
    fprintf("Reading file %d of %d...\n", i, numFiles)
    video{i} = readVideo(files{i});
end
numObservations = numel(video);
idx = randperm(numObservations);
N = floor(0.9 * numObservations);
idxTrain = idx(1:N);
imdsTrainData = video(idxTrain);
%imdsTrainData=imdsTrainData1';
labelsTrain1 = labels(idxTrain);
labelsTrain=labelsTrain1';
%imdsTrain={imdsTrainData' labelsTrain};
idxValidation = idx(N+1:end);
imdsValidationData = video(idxValidation);
%imdsValidationData=imdsValidationData1';
labelsValidation1 = labels(idxValidation);
labelsValidation=labelsValidation1';
%imdsValidation={imdsValidationData' labelsValidation};
% %_______________
%imdsTrain=
% imdsValidation
layers = [
    imageInputLayer([38 62 3])
    convolution2dLayer(3,8,'Padding','same')
    batchNormalizationLayer
    reluLayer
    maxPooling2dLayer(2,'Stride',2)
    convolution2dLayer(3,16,'Padding','same')
    batchNormalizationLayer
    reluLayer
    maxPooling2dLayer(2,'Stride',2)
    convolution2dLayer(3,32,'Padding','same')
    batchNormalizationLayer
    reluLayer
    fullyConnectedLayer(2)
    softmaxLayer
    classificationLayer];
options = trainingOptions('adam', ...
    'InitialLearnRate',1e-4, ...
    'GradientThreshold',2, ...
    'Shuffle','every-epoch', ...
    'ValidationData',{imdsValidationData,labelsValidation}, ...
    'ValidationFrequency',5, ...
    'Plots','training-progress', ...
    'Verbose',false);
[netLSTM,info] = trainNetwork(imdsTrainData',labelsTrain,layers,options);
Answers (1)
Garmit Pant
2022-6-30
Hello Shilpa
It is my understanding that you want to build a classifier that takes video input, and that trainNetwork throws an error when you try to train it.
As the error message says, trainNetwork accepts predictors only as a numeric array, a datastore, or a table; a cell array of sequences is accepted only for networks with sequence input. Your network starts with an imageInputLayer, so a cell array is rejected. The input that you are passing to the function is imdsTrainData, a cell array that stores the frame data of each video and is defined as:
imdsTrainData = video(idxTrain);
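To make the accepted formats concrete, here is a minimal sketch of the numeric-array route. It assumes that classifying individual frames is acceptable and that readVideo returns a height-by-width-by-3-by-numFrames array with 38-by-62 frames (matching the imageInputLayer in the question). Random arrays stand in for the real videos, and the names XTrain and YTrain are illustrative only.

```matlab
% Stand-in data: two "videos" of 5 and 7 RGB frames, each frame 38-by-62.
% In the real script these would be the outputs of readVideo (assumed layout:
% height-by-width-by-3-by-numFrames).
video  = {rand(38,62,3,5), rand(38,62,3,7)};
labels = [1 2];                       % one numeric label per video: 1 = 'A', 2 = 'B'

% Number of frames in each video.
framesPerVideo = cellfun(@(v) size(v,4), video);

% Stack all frames along the 4th dimension into a single numeric array.
XTrain = cat(4, video{:});

% Repeat each video's label once per frame and convert to a categorical vector.
YTrain = categorical(repelem(labels, framesPerVideo)', [1 2], {'A','B'});

disp(size(XTrain))
disp(class(YTrain))
```

XTrain and YTrain could then be passed as trainNetwork(XTrain,YTrain,layers,options) to a network that begins with imageInputLayer([38 62 3]); the validation data given to trainingOptions would need the same conversion. Note that the labels must be categorical for a classification network, whereas the question passes a cell array of numbers. This per-frame approach ignores the order of the frames, which is why the sequence approach described next is usually preferred for video.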
As mentioned earlier, each cell of imdsTrainData holds the frames of one video file. To use trainNetwork, you need to pass one of the data types mentioned above. One way to do so is to convert each video to a sequence of feature vectors and train a network with sequence input (for example, an LSTM network) on those sequences. The feature vectors can be obtained with the activations function, taking the output of the last pooling layer of the GoogLeNet network ("pool5-7x7_s1"):
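% Pretrained GoogLeNet used as a feature extractor (requires the GoogLeNet support package).
netCNN = googlenet;
inputSize = netCNN.Layers(1).InputSize(1:2);
layerName = "pool5-7x7_s1";

% Cache the extracted sequences so the videos are processed only once.
% The file name is taken from the tutorial; rename it for your own data.
tempFile = fullfile(tempdir,"hmdb51_org.mat");

if exist(tempFile,'file')
    load(tempFile,"sequences")
else
    numFiles = numel(files);
    sequences = cell(numFiles,1);

    for i = 1:numFiles
        fprintf("Reading file %d of %d...\n", i, numFiles)

        % readVideo and centerCrop are helper functions from the tutorial linked below.
        % files is a cell array in the question, so it is indexed with braces.
        video = readVideo(files{i});
        video = centerCrop(video,inputSize);

        % One feature vector per frame, stored as the columns of a matrix.
        sequences{i,1} = activations(netCNN,video,layerName,'OutputAs','columns');
    end

    save(tempFile,"sequences","-v7.3");
end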
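Once the sequences are extracted, they can be passed to trainNetwork together with categorical labels, provided the network begins with a sequenceInputLayer. The following is a minimal sketch along the lines of the tutorial linked below, not a tuned model. Random 1024-by-numFrames arrays stand in for the real GoogLeNet features, and the hidden size (32), the number of epochs, and the mini-batch size are illustrative assumptions.

```matlab
rng(0);                               % reproducible stand-in data

numFeatures     = 1024;               % length of each "pool5-7x7_s1" feature vector
numClasses      = 2;                  % 'A' and 'B'
numObservations = 8;                  % hypothetical number of videos

% Stand-in for the extracted features: one numFeatures-by-numFrames matrix per video.
sequences = cell(numObservations,1);
for i = 1:numObservations
    numFrames = randi([10 20]);
    sequences{i} = rand(numFeatures,numFrames,'single');
end

% One categorical label per video.
labels = categorical([ones(4,1); 2*ones(4,1)], [1 2], {'A','B'});

% Sequence classification network: the sequence input layer is what allows
% trainNetwork to accept a cell array of sequences.
layers = [
    sequenceInputLayer(numFeatures)
    bilstmLayer(32,'OutputMode','last')
    dropoutLayer(0.5)
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer];

options = trainingOptions('adam', ...
    'MaxEpochs',2, ...
    'MiniBatchSize',4, ...
    'InitialLearnRate',1e-4, ...
    'GradientThreshold',2, ...
    'Shuffle','every-epoch', ...
    'Verbose',false);

netLSTM   = trainNetwork(sequences,labels,layers,options);
predicted = classify(netLSTM,sequences);
```

With real data, sequences would come from the feature-extraction loop above, and labels would be built from the file names or from the numeric labels in the question, converted with categorical.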
You can refer to the following tutorial to read more about video classification using deep learning in MATLAB: https://www.mathworks.com/help/deeplearning/ug/classify-videos-using-deep-learning.html