This question is closed. Reopen it to edit or answer.

Neural network: train() behavior with earlier results

I have a very large dataset of around 150 GB that I need to process using neural networks. Because the data is so large, I have to break it into chunks; for example, 5000 elements are sent as 20 batches of 250 elements each. The following dummy code illustrates this:
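batch_size = 250;    % elements per batch
num_batches = 20;    % 20 batches of 250 = 5000 elements
for count = 1:num_batches
    % train() expects one sample per column, so select a range of columns
    idx = (count-1)*batch_size + 1 : count*batch_size;
    inputs = entire_input(:, idx);
    targets = entire_targets(:, idx);
    net = train(net, inputs, targets);
end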
Will the network start training from scratch with each new batch, or will it retain the weights calculated from the previous batch? From my discussions and findings so far, with each new batch the weights adapt to the current data and may overwrite what was learned earlier.
Please advise whether this method works well, or whether some other method should be used instead of train().

Answers (1)

Greg Heath 2018-1-29
"Need to process" doesn't provide useful information.
What are you trying to design? Curvefitter/Regressor? PatternRecognizer/Unsupervised-Classifier/Supervised-Classifier? Timeseries??
In all cases, training, validation and test data should have similar summary statistics in all run batches. Otherwise training batch n will erase some of what is learned in batches 1 to n-1.
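One way to check the point about similar summary statistics is to compare a simple per-batch statistic before and after shuffling the sample order. The following is only a minimal sketch on synthetic data: the matrix `entire_input`, its size, and the choice of the per-batch mean as the statistic are assumptions made for illustration, not part of the original question.

```matlab
% Synthetic stand-in for the real data (assumption): 3 features x 5000 samples.
% The first feature is deliberately sorted, so consecutive chunks differ.
rng(0);                                    % reproducible shuffle
num_samples = 5000;
batch_size  = 250;
num_batches = num_samples / batch_size;    % 20 batches
entire_input = [linspace(0, 1, num_samples); randn(2, num_samples)];

% Shuffle the sample (column) order once, before cutting the batches.
% The same permutation would have to be applied to the targets as well.
perm = randperm(num_samples);
shuffled_input = entire_input(:, perm);

% Mean of the first feature in every batch, before and after shuffling.
mean_sorted   = zeros(1, num_batches);
mean_shuffled = zeros(1, num_batches);
for count = 1:num_batches
    idx = (count-1)*batch_size + 1 : count*batch_size;
    mean_sorted(count)   = mean(entire_input(1, idx));
    mean_shuffled(count) = mean(shuffled_input(1, idx));
end

% Spread of the per-batch means: a smaller spread means the batches look more alike.
spread_sorted   = max(mean_sorted)   - min(mean_sorted);
spread_shuffled = max(mean_shuffled) - min(mean_shuffled);
fprintf('spread before shuffle: %.3f, after shuffle: %.3f\n', ...
        spread_sorted, spread_shuffled);
```

For data that does not fit in memory, the same idea would apply to the order in which records are read from disk rather than to an in-memory matrix.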
Your response should be far less vague than your original explanation.
Hope this helps.
Greg
Thank you for formally accepting my answer
1 Comment
Akshay Joshi 2018-1-29
Edited: Akshay Joshi 2018-1-29
Hi Greg,
I'm trying to design a supervised classifier using multilayer perceptrons (feedforwardnet). The input matrix is 500,000 x 25 and the output matrix is 5,000 x 25.
Initially, I tried to train my network using nntool, but I was unable to feed a dataset this large (150 GB) into it because of memory constraints, so I decided to break the data into chunks. For this purpose, I'm writing a MATLAB script to create the neural network and provide the input in chunks.
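A script of that kind might look roughly like the sketch below. It is not a tested recipe for the 150 GB case: the synthetic data, the hidden layer size, the number of passes, and the few epochs per chunk are all assumptions chosen to keep the example small. It relies on the fact that train() continues from the weights already stored in net, which are only reset by an explicit call such as init(net).

```matlab
% Small synthetic stand-in data (assumption): 4 features x 1000 samples, 2 classes.
rng(0);
num_samples = 1000;
batch_size  = 250;
num_batches = num_samples / batch_size;
num_passes  = 3;                                   % sweeps over all chunks
entire_input   = randn(4, num_samples);
labels         = double(sum(entire_input, 1) > 0); % class 1 where the feature sum is positive
entire_targets = [labels; 1 - labels];             % one-hot targets, 2 x N

net = feedforwardnet(10, 'trainscg');   % one hidden layer with 10 neurons
net.divideFcn = 'dividetrain';          % use every sample of a chunk for training
net.trainParam.epochs = 5;              % only a few epochs per chunk
net.trainParam.showWindow = false;      % suppress the training GUI

% Configure once: sets input/output sizes and initializes the weights.
first_idx = 1:batch_size;
net = configure(net, entire_input(:, first_idx), entire_targets(:, first_idx));
wb_before = getwb(net);                 % weights and biases before any training

for pass = 1:num_passes
    for count = 1:num_batches
        idx = (count-1)*batch_size + 1 : count*batch_size;
        % Each call starts from the weights left by the previous call.
        net = train(net, entire_input(:, idx), entire_targets(:, idx));
    end
end
wb_after = getwb(net);                  % weights and biases after all passes
```

Keeping the number of epochs per chunk small and sweeping over the chunks several times is one way to limit how far any single chunk can pull the weights; whether it is sufficient depends on how similar the chunks are.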
"In all cases, training, validation and test data should have similar summary statistics in all run batches. Otherwise training batch n will erase some of what is learned in batches 1 to n-1."
Can you suggest a method by which we can retain the information from batches 1 to n-1 and, based on that, calculate the results for, say, batches n to n+k?
Thanks for the earlier response. I hope I'm clearer this time.
