Error training neural network with datastore

Question

HpW 2021-1-25

Direct link to this question:

https://ww2.mathworks.cn/matlabcentral/answers/725968-error-training-neural-network-with-datastore

Answered: Renee Coetsee 2022-4-1

Hello

I am training an LSTM for sequence-to-sequence labeling.

I currently have it set up so that XTrain is a 5000 x 1 cell array in which each of the 5000 rows is a 10 x n double.

YTrain is a 5000 x 1 cell array in which each of the 5000 rows is a 1 x n categorical array with 4 categories.

XTrain =
  5000×1 cell array
    {10×2371 double}
    {10×2792 double}
    {10×3044 double}
    {10×2878 double}
    {10×2790 double}
    ...
   
    
YTrain =
  5000×1 cell array
    {1×2371 categorical}
    {1×2792 categorical}
    {1×3044 categorical}
    {1×2878 categorical}
    {1×2790 categorical}
    ...
 

I have layers defined as follows with 10 inputs and 4 outputs

layers = [ ...
    sequenceInputLayer(10)
    lstmLayer(150,'OutputMode','sequence')
    fullyConnectedLayer(4)
    softmaxLayer
    classificationLayer];

In the interest of space I have omitted the options, as I do not think they have anything to do with the error, but I can provide them if needed.

If I train the NNet as follows, it works fine.

net = trainNetwork(XTrain, YTrain,layers,options)

However, I want to perform some transformations on the signals, so I wanted to set it up to run with a transformed datastore rather than passing in XTrain and YTrain separately. Before getting the transformed datastore working, I wanted to test that I could train the network using the syntax that accepts a datastore instead of XTrain and YTrain separately.

I set up a datastore as follows:

dsXTrain = arrayDatastore(XTrain,'OutputType','same');
dsYTrain = arrayDatastore(YTrain,'OutputType','same');
dsTrain = combine(dsXTrain,dsYTrain);

Now, if I try to train the NN as follows, I get an error:

net = trainNetwork(dsTrain,layers,options)
Error using trainNetwork (line 183)
Unexpected input size: The input layer expects sequences with the same sequence length and feature dimension 10.
Error in train_nn (line 82)
net = trainNetwork(dsTrain,layers,options);

If I try to look at the data in the datastore it looks fine...

readall(dsTrain)
ans =
  5000×2 cell array
    {10×2371 double}    {1×2371 categorical}
    {10×2792 double}    {1×2792 categorical}
    {10×3044 double}    {1×3044 categorical}
    {10×2878 double}    {1×2878 categorical}
    {10×2790 double}    {1×2790 categorical}
    ...                  ...

which looks like XTrain and YTrain.

I cannot figure out what the exact problem is. I assume it's not passing the data from dsTrain into trainNetwork properly, but I'm at a loss to figure out what specifically the error is...

Any thoughts on how to fix it?

thanks!

hpw

1 comment

HpW 2021-1-25

Edited: HpW 2021-1-25

I made a fake datastore with the same signal/label repeated multiple times, and this worked. It seems that if I pass in a datastore, all the signals have to be the same length? This isn't the case when passing in XTrain and YTrain individually... is this really the problem?
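One way to check whether unequal sequence lengths are the trigger is to compare the lengths stored in the datastore. The sketch below is illustrative only: the toy XTrain and YTrain are made up here (an assumption, with 10 features and 4 classes as in the question) and stand in for the real data.

```matlab
% Toy stand-ins for the real data (assumption: 10 features, 4 classes).
seqLengths = [30 45 45 60];
classNames = ["a" "b" "c" "d"];
XTrain = cell(numel(seqLengths),1);
YTrain = cell(numel(seqLengths),1);
for i = 1:numel(seqLengths)
    n = seqLengths(i);
    XTrain{i} = randn(10,n);
    YTrain{i} = categorical(classNames(randi(4,1,n)),classNames);
end

% Same datastore construction as in the question.
dsTrain = combine(arrayDatastore(XTrain,'OutputType','same'), ...
                  arrayDatastore(YTrain,'OutputType','same'));

% Read everything back and compare predictor and response lengths.
data = readall(dsTrain);
xLen = cellfun(@(x) size(x,2),data(:,1));
yLen = cellfun(@numel,data(:,2));

lengthsMatch  = isequal(xLen,yLen);       % X and Y agree per observation
allSameLength = isscalar(unique(xLen));   % false when the lengths differ
fprintf('X/Y lengths match: %d, all sequences same length: %d\n', ...
    lengthsMatch,allSameLength);
```

If the X/Y lengths match but the sequences differ in length from one observation to the next, the datastore contents themselves are consistent and the length mismatch across observations is the likelier cause.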

These are my training options in case that helps ...

options = trainingOptions('adam', ...
    'MaxEpochs',40, ...
    'MiniBatchSize',32, ...
    'SequenceLength','longest', ...
    'ValidationPatience',Inf, ...
    'ValidationFrequency',50, ...
    'InitialLearnRate',0.002, ...
    'LearnRateDropPeriod',4, ...
    'LearnRateDropFactor',0.6, ...
    'LearnRateSchedule','piecewise', ...
    'GradientThreshold',1, ...
    'Plots','training-progress',...
    'Shuffle','every-epoch',...
    'Verbose',1);

Isn't the 'longest' sequence length supposed to deal with padding the signals as needed in each mini-batch?

thx

Answer 1

Renee Coetsee 2022-4-1

Direct link to this answer:

https://ww2.mathworks.cn/matlabcentral/answers/725968-error-training-neural-network-with-datastore#answer_932514

This issue occurs because sequence padding is not supported with datastores.

You can find a solution here:

https://www.mathworks.com/help/deeplearning/ug/train-network-using-out-of-memory-sequence-data.html

If you make a transform function for your datastore, you can then pad the sequences to a specified length. Then they will all be the same length during training.
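A minimal sketch of such a transform, under stated assumptions: the toy data, the choice of padding to the longest sequence in the data set, zero-padding the predictors, and repeating the last label to pad the responses are all illustrative choices made here, not something the linked example prescribes. For sequence-to-sequence labeling the responses have to be padded along with the predictors so that both keep the same length.

```matlab
% Toy stand-ins for XTrain/YTrain (assumption: 10 features, 4 classes).
seqLengths = [30 45 60];
classNames = ["a" "b" "c" "d"];
numObs = numel(seqLengths);
XTrain = cell(numObs,1);
YTrain = cell(numObs,1);
for i = 1:numObs
    n = seqLengths(i);
    XTrain{i} = randn(10,n);
    YTrain{i} = categorical(classNames(randi(4,1,n)),classNames);
end

dsTrain = combine(arrayDatastore(XTrain,'OutputType','same'), ...
                  arrayDatastore(YTrain,'OutputType','same'));

% Pad everything to the longest sequence in the data set (assumed choice).
targetLength = max(seqLengths);

% Right-pad predictors with zeros and labels by repeating the last label.
padX = @(X) [X, zeros(size(X,1),targetLength - size(X,2))];
padY = @(Y) [Y, repmat(Y(end),1,targetLength - numel(Y))];

% Each read returns an N-by-2 cell array {X, Y}; pad both columns.
padRows = @(data) [cellfun(padX,data(:,1),'UniformOutput',false), ...
                   cellfun(padY,data(:,2),'UniformOutput',false)];

dsPadded = transform(dsTrain,padRows);

% Check that every observation now has the same length.
padded = readall(dsPadded);
paddedLengths = cellfun(@(x) size(x,2),padded(:,1));
```

The padded datastore would then be passed to trainNetwork in place of dsTrain. Note that with this padding scheme the padded time steps carry real labels and therefore contribute to the loss, which may or may not be acceptable for a given problem.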

Answer 2

Puru Kathuria 2021-1-31

Direct link to this answer:

https://ww2.mathworks.cn/matlabcentral/answers/725968-error-training-neural-network-with-datastore#answer_610628

Hi,

I tried training the network on my dataset using both a cell array and a datastore. Both worked fine, as expected. I could not debug your code because I don't have your data. I am listing my network architecture and training options for your reference.

And yes, setting the SequenceLength property to 'longest' pads the sequences in each mini-batch to the same length as the longest sequence. This option does not discard any data, though padding can introduce noise to the network.

layers = [ ...
    sequenceInputLayer(numFeatures)
    bilstmLayer(150,"OutputMode","sequence")
    bilstmLayer(150,"OutputMode","sequence")
    fullyConnectedLayer(2)
    softmaxLayer
    classificationLayer
    ];
maxEpochs = 40;
miniBatchSize = 64;
options = trainingOptions("adam", ...
    "InitialLearnRate",1e-4, ...
    "MaxEpochs",maxEpochs, ...
    "MiniBatchSize",miniBatchSize, ...
    "Shuffle","every-epoch", ...
    "Verbose",false, ...
    "ValidationFrequency",floor(numel(TrainingFeatures)/miniBatchSize), ...
    "ValidationData",{FeaturesValidationClean.',BaselineV}, ...
    "Plots","training-progress", ...
    "LearnRateSchedule","piecewise", ...
    "LearnRateDropFactor",0.1, ...
    "LearnRateDropPeriod",5);
[network,netInfo] = trainNetwork(TrainingFeatures,TrainingMasks,layers,options);
% Similarly, after making datastores, I used the command below to train.
net = trainNetwork(adsTrain,layers,options);

As far as I understand, it might be a debugging issue. In case you want a reference example that demonstrates training an LSTM, you can refer to this section.
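The per-mini-batch padding that 'longest' refers to in this answer can be illustrated with plain arithmetic. The sketch below does not call any Deep Learning Toolbox function; it takes the first five sequence lengths shown in the question and treats them as a single mini-batch (an assumption, since the question uses a mini-batch size of 32).

```matlab
% Lengths of the first five sequences shown in the question.
seqLengths = [2371 2792 3044 2878 2790];

% Treat these five as one mini-batch (assumption; the question uses 32).
paddedLength = max(seqLengths);          % length every sequence is padded to
numPadded = paddedLength - seqLengths;   % time steps added per sequence

disp(table(seqLengths.',numPadded.','VariableNames',{'Length','Padding'}));
```

Here every sequence would be padded to 3044 time steps, so the shortest one gains 673 padded steps and the longest gains none.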

1 comment

HpW 2021-2-3

data.mat

Thank you for the reply!

I've tried to figure out what's wrong but no luck. I have attached a .mat file with some fake data in it. It's not what I'm using to train, but I can't upload that. It's some random data that won't train anything useful, but it should allow the training to proceed.

XTrainR is a 100x1 cell array containing doubles.

YTrainR is a 100x1 cell array containing categorical arrays with 3 categories, each the same length as the corresponding entry of XTrainR.

I'm using the following code:

% Combine datastore
dsXTrainR = arrayDatastore(XTrainR,'OutputType','same');
dsYTrainR = arrayDatastore(YTrainR,'OutputType','same');
dsTrainR = combine(dsXTrainR,dsYTrainR);
% Set parameters
batchsize = 12;
num_neurons = 150;
num_outputs = 3;
epo = 40;
learnrate = 0.005;
dropperiod = 4;
droprate = 0.6;
options = trainingOptions('adam', ...
    'MaxEpochs',epo, ...
    'MiniBatchSize',batchsize, ...
    'SequenceLength','longest', ...
    'InitialLearnRate',learnrate, ...
    'LearnRateDropPeriod',dropperiod, ...
    'LearnRateDropFactor',droprate, ...
    'LearnRateSchedule','piecewise', ...
    'GradientThreshold',1, ...
    'Plots','training-progress',...
    'Shuffle','every-epoch',...
    'Verbose',1);
  
layers = [ ...
    sequenceInputLayer(1)
    lstmLayer(num_neurons,'OutputMode','sequence')
    fullyConnectedLayer(num_outputs)
    softmaxLayer
    classificationLayer];

If I run

Net = trainNetwork(XTrainR, YTrainR,layers,options);

it works fine

but if I run

Net = trainNetwork(dsTrainR,layers,options);

I keep getting the same error:

Error using trainNetwork (line 183)
Unexpected input size: The input layer expects sequences with the same sequence length and feature dimension 1.

I've also tried using your code for layers and options (with slight modification given the dataset), but still get the same error...

Am I creating the datastore incorrectly?

thanks again for your help!

hpw

Answer 3

kewei chen 2021-2-7

Direct link to this answer:

https://ww2.mathworks.cn/matlabcentral/answers/725968-error-training-neural-network-with-datastore#answer_617087

I have the same problem as you, but I still can't work it out. Hope someone can help.

3 comments (1 earlier comment not shown)

HpW 2021-2-7

But what if you don't want a mini-batch of 1? In the above code I thought I was setting the mini-batch size to 12, which should be okay for a dataset of size 100. I wonder if it's some issue with creating the datastore, because that's the only thing that changes between

Net = trainNetwork(XTrainR, YTrainR,layers,options);

which works and

Net = trainNetwork(dsTrainR,layers,options);

which doesn't

tx

Atallah Baydoun 2022-3-2

Was anyone able to fix this problem?

I am training a multi-path network that takes as input a 3D image and 4 features.

I got the same error, and it seems I cannot debug it.
