Creating and using Datastore for LSTM time sequence data

I have more than 10,000 time-sequence data files, each stored as an individual CSV file. Each file holds one sample: measurements of 6300 features at 5 time steps. Each column contains the measurements of one feature. The labels are stored sequentially in a separate file.
-0.7 -1.7 -5.09 -4.79 ....
-0.7 -1.7 -5.09 -4.79 ....
-1.06 -1.59 -5.08 -4.76 .....
-1.42 -1.86 -5.61 -4.86 ....
-1.34 -2.01 -5.1 -4.62 .....
numFeatures = 6300;      % one input channel per measured feature
numHiddenUnits = 100;
numClasses = 3;
layers = [ ...
    sequenceInputLayer(numFeatures)
    lstmLayer(numHiddenUnits,'OutputMode','last')   % output only the last time step
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer];
options = trainingOptions('adam', ...
    'MiniBatchSize',20, ...
    'MaxEpochs',10, ...
    'Shuffle','once', ...
    'GradientThreshold',0.001, ...
    'Verbose',1, ...
    'Plots','training-progress');
I want to use this data for LSTM classification, but I cannot load all of it into memory for training.
MATLAB expects the training sequences as a cell array with one cell per sequence sample.
How can I load the files and train the network with a datastore for such a large data set?

Accepted Answer

Angelo Yeo 2024-2-11
tabularTextDatastore can manage a large collection of CSV files. To quote the documentation:
Use a TabularTextDatastore object to manage large collections of text files containing column-oriented or tabular data where the collection does not necessarily fit in memory.
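As a rough sketch of how this could look for data laid out like the question's (one CSV file per sequence, time steps in rows), the snippet below reads one whole file per call and transposes it to the features-by-time-steps layout that sequenceInputLayer expects. The folder name, the file names, and the tiny 5-by-4 size are made up for the demo and are not from the original post; the real files would be 5-by-6300.

```matlab
% Hypothetical demo folder with two tiny CSV files (5 time steps x 4 features).
demoDir = fullfile(tempdir, 'seqDemoTabular');
if ~exist(demoDir, 'dir'), mkdir(demoDir); end
writematrix(rand(5, 4), fullfile(demoDir, 'seq_0001.csv'));
writematrix(rand(5, 4), fullfile(demoDir, 'seq_0002.csv'));

% 'ReadSize','file' makes each read return one complete file as a table.
ttds = tabularTextDatastore(demoDir, ...
    'FileExtensions', '.csv', ...
    'ReadVariableNames', false, ...
    'ReadSize', 'file');

T = read(ttds);          % 5-by-4 table: time steps in rows, features in columns
X = table2array(T).';    % 4-by-5 matrix: features-by-time steps
disp(size(X))
```

Only one file is held in memory per read, so the full collection never has to fit in memory at once.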
1 Comment
Narayan 2024-2-12
Thank you, Mr. Angelo.
1. How can the datastore be used for training so that it forms the mini-batches itself?
2. What is the best way to store the labels of the training data for classification, so that they are not mismatched with the sequences when the data is shuffled?
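One possible way to address both points, sketched under assumptions: look up each label by file name inside a transformed datastore, so every read returns a {sequence, label} pair and the pairing cannot drift when the files are shuffled. The folder, file names, label values, and small sizes below are hypothetical stand-ins; with real data the map would be built from the separate label file, which assumes its row order can be matched to the file names.

```matlab
% --- Synthetic stand-in for the real data (all names are hypothetical) ---
demoDir = fullfile(tempdir, 'seqDemoLabeled');
if ~exist(demoDir, 'dir'), mkdir(demoDir); end
numFiles = 6; numFeatures = 4; numSteps = 5;
fileNames = strings(numFiles, 1);
for k = 1:numFiles
    fileNames(k) = sprintf('seq_%04d.csv', k);
    writematrix(rand(numSteps, numFeatures), fullfile(demoDir, fileNames(k)));
end
labelValues = ["a"; "b"; "c"; "a"; "b"; "c"];   % one label per file, same order

% Map file name -> label so the pairing does not depend on read order.
labelMap   = containers.Map(cellstr(fileNames), cellstr(labelValues));
classNames = unique(labelValues);

% Each read returns one file as a numSteps-by-numFeatures matrix.
fds = fileDatastore(demoDir, 'ReadFcn', @readmatrix, 'FileExtensions', '.csv');

% Helpers: strip the folder from the path, then look up the label.
baseName = @(f) regexp(char(f), '[^\\/]+$', 'match', 'once');
toLabel  = @(f) categorical(string(labelMap(baseName(f))), classNames);

% Return {features-by-time-steps matrix, label}; pass info through unchanged.
addLabel = @(X, info) deal({X.', toLabel(info.Filename)}, info);
tds = transform(fds, addLabel, 'IncludeInfo', true);

sample = read(tds);   % 1-by-2 cell: {predictors, response}
disp(size(sample{1}))
disp(sample{2})

% With Deep Learning Toolbox, the datastore replaces the in-memory cell array:
% net = trainNetwork(tds, layers, options);
```

When a datastore is passed to trainNetwork, the mini-batches are read from it according to 'MiniBatchSize', so no manual batching is needed. All sequences here have the same length (5 time steps), so no padding step is included. If isShuffleable(tds) returns false in your release, 'Shuffle' would need to be set to 'never'.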
