Sequence length for the LSTM training

Hello, I am trying to develop an LSTM network that can predict a time series matching the results of a dynamical simulation.
I have a number of questions regarding LSTMs.
1) Is it possible to use a very long sequence length (around 100,000 time steps with 12 features) for the LSTM, as long as memory allows?
Or should I split the long time series into a larger number of shorter observations?
I already have many observations (20), that is, 20 cells, each containing a 12 x 100,000 matrix,
so I was wondering whether it is okay to train a network on very long sequences without any data preprocessing, as in: https://www.mathworks.com/help/deeplearning/ug/sequence-to-sequence-regression-using-deep-learning.html
2) Another question: my dynamical system seems to have a period of 7 to 8 seconds.
Given that the samples are 0.1 seconds apart, I would like to focus primarily on the last 70 to 80 time steps instead of the whole sequence.
Without data preprocessing, is there any way I can limit the size of the "window" for the LSTM?
I know there is a "sequence length" training option, but I am not sure whether it relates to the window I am referring to.
Thank you very much for your help in advance!

Answers (1)

Sandeep 2023-5-24
Hi byungchan,
It is my understanding that you have questions about using long sequence lengths with LSTMs.
1) While it is technically possible to use a very long sequence length for an LSTM, it is not always the best approach. Very long sequences can cause problems such as vanishing gradients, which make it difficult to learn dependencies across the sequence. Training on such long sequences also requires a large amount of memory, which can lead to out-of-memory errors on a GPU with limited memory. Splitting the sequence into shorter sub-sequences is a common way to address these challenges.
In your specific case, it may be better to split the long time series into shorter sequences and experiment with different lengths to optimize your model's performance. This would also make the training faster and more tractable since you would only be training your model on shorter sequences at a time. It is up to you to determine how best to split and format your data based on your model's requirements and the specific features of your data.
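The splitting described above could look roughly like the following. This is only a sketch under stated assumptions: the variable names (XLong, YLong, chunkLength) are hypothetical, the data is random and much smaller than 12 x 100,000 so it runs quickly, a single response per time step is assumed, and the chunk length of 200 is an arbitrary starting point rather than a recommendation.

```matlab
% Hypothetical stand-in data: a few observations, each numFeatures-by-numSteps.
% The real data in the question is 20 cells of size 12-by-100,000.
numObs = 3; numFeatures = 12; numSteps = 1000;
XLong = cell(numObs,1);
YLong = cell(numObs,1);
for i = 1:numObs
    XLong{i} = randn(numFeatures,numSteps);
    YLong{i} = randn(1,numSteps);            % assumed: one response per time step
end

chunkLength = 200;   % assumed sub-sequence length; tune experimentally

XShort = {};
YShort = {};
for i = 1:numObs
    T = size(XLong{i},2);
    numChunks = floor(T/chunkLength);        % an incomplete trailing chunk is dropped
    for k = 1:numChunks
        idx = (k-1)*chunkLength + (1:chunkLength);
        XShort{end+1,1} = XLong{i}(:,idx);   % predictors for this chunk
        YShort{end+1,1} = YLong{i}(:,idx);   % matching responses for this chunk
    end
end
```

The resulting cell arrays have the same layout as the original ones (one matrix per cell, features in rows, time in columns), so they can be passed to training in place of the long sequences. Trying a few values of chunkLength is a simple way to run the experiment suggested above.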
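2) Yes, you can use the sequence length training option to control how many time steps the network processes at a time. You set it through the training options passed to the trainNetwork function. Note that when this option is a positive integer, the software splits each sequence into consecutive chunks of that length; it does not slide a window over the most recent time steps.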
Specifically, you can set the sequence length to match the number of time steps you want to focus on. A period of 7 to 8 seconds sampled every 0.1 seconds corresponds to 70 to 80 time steps, so you can set the sequence length to 70 or 80.
% Sample implementation (sequence-to-sequence regression)
inputSize = 12;        % number of features per time step, from the question
outputSize = 1;        % number of responses; assumed here, adjust to your data
sequenceLength = 70;   % 7 s period / 0.1 s sampling interval
miniBatchSize = 64;
numHiddenUnits = 100;
layers = [ ...
    sequenceInputLayer(inputSize)
    lstmLayer(numHiddenUnits,'OutputMode','sequence')
    fullyConnectedLayer(outputSize)
    regressionLayer];  % regression output, since the task is time series prediction
options = trainingOptions('adam', ...
    'MaxEpochs',100, ...
    'MiniBatchSize',miniBatchSize, ...
    'SequenceLength',sequenceLength, ... % set the sequence length here
    'GradientThreshold',1, ...
    'Shuffle','never', ...
    'Plots','training-progress');
% XTrain and YTrain are cell arrays of inputSize-by-T and outputSize-by-T matrices
net = trainNetwork(XTrain,YTrain,layers,options);
Note that changing the sequence length parameter can affect the accuracy of your LSTM model.
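If a true sliding window over the most recent 70 to 80 time steps is what is wanted, one alternative is to build the windows explicitly, which does count as preprocessing. The sketch below shows the arithmetic (7 s / 0.1 s = 70 steps) and the window extraction using base MATLAB only. The names X, XWin, and YWin are hypothetical, the data is random, and pairing each window with the next time step as its target is an assumption about the prediction task rather than something stated in the question.

```matlab
period = 7;                          % seconds, lower end of the 7 to 8 s period
dt = 0.1;                            % seconds per sample, from the question
windowLength = round(period/dt);     % 70 time steps

% Hypothetical stand-in for one observation (12 features, 500 time steps).
X = randn(12,500);
T = size(X,2);

% Window k covers time steps k to k+windowLength-1.
% Its assumed target is the sample at time step k+windowLength.
numWindows = T - windowLength;
XWin = cell(numWindows,1);
YWin = zeros(size(X,1),numWindows);
for k = 1:numWindows
    XWin{k} = X(:, k:(k+windowLength-1));
    YWin(:,k) = X(:, k+windowLength);
end
```

Each cell of XWin is then a short 12 x 70 sequence, so the network only ever sees one period of history per example. Setting period to 8 gives 80-step windows instead.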
For more information, refer to the trainNetwork documentation page on mathworks.com.
