Simulating mini-batches with the shallow NN train function
I have large datasets: 200 x 840000 inputs and 6 x 840000 targets. How could I use train on, say, 5000 samples at a time, working across the entire data set while keeping performance up across the whole set, so that I don't necessarily have to handle all of the data at once? Something like mimicking the mini-batch technique of deep training, but for shallow training on huge data sets. Below is what I have come up with.
rng('shuffle');
neurons       = 12;
epochs        = 2;
miniBatchSize = 3000;
parameters    = 6;    % number of target rows (6 targets per sample, per the data described above)
miniBatch = single([]);
TrainI = [];
TrainT = [];
% GenTogAllData holds inputs and targets stacked row-wise (200 input rows + 6 target rows)
[AllTrain, AllTest] = dividerand(GenTogAllData, 0.91, 0.09);
net = fitnet(neurons);
net.trainFcn = 'trainscg';
net.trainParam.showWindow = 1;
net.trainParam.epochs = 1;    % only one pass over each mini-batch per call to train
tic
for i = 1:epochs
    i                                             % echo the epoch counter
    j  = 1;
    ii = 1;
    randomNumbers = randperm(size(AllTrain, 2));  % reshuffle the sample order every epoch
    while j <= size(AllTrain, 2)
        miniBatch(:, ii) = single(AllTrain(:, randomNumbers(j)));
        j  = j + 1;
        ii = ii + 1;
        if size(miniBatch, 2) == miniBatchSize
            % split the stacked mini-batch back into inputs and targets
            TrainI = miniBatch(1:(size(AllTrain, 1) - parameters), :);
            TrainT = miniBatch((size(AllTrain, 1) - parameters + 1):size(AllTrain, 1), :);
            net = train(net, TrainI, TrainT);
            miniBatch = [];
            TrainI = [];
            TrainT = [];
            ii = 1;
        end
        % note: a trailing partial batch smaller than miniBatchSize is never trained on
    end
end
toc
It runs much quicker per epoch (as I have defined them) than an epoch over the entire data set, but the best behavior I reach this way is never as good as when I let the network train on the entire data set for a long time. I know this is batch training one epoch at a time; you can easily try adapt as well and establish your own performance criteria, and it still doesn't do as well.
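For reference, a minimal sketch of the adapt route (this assumes the same row-stacked AllTrain matrix as above, an arbitrary 5000-sample chunk size, and that the network's default adaptFcn and weight learning functions are left in place):
% Incremental weight updates with adapt instead of repeated calls to train
inRows = 200;                                   % number of input rows (assumption from the question)
chunk  = 5000;                                  % samples per adapt call (arbitrary)
net2 = fitnet(12);
net2 = configure(net2, AllTrain(1:inRows, 1:chunk), AllTrain(inRows+1:end, 1:chunk));
for s = 1:chunk:size(AllTrain, 2) - chunk + 1
    cols = s:s+chunk-1;
    net2 = adapt(net2, AllTrain(1:inRows, cols), AllTrain(inRows+1:end, cols));
end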
Is there a fundamental reason why we might not be able to do this? I will soon have more data than I can fit in RAM, and I want to achieve the performance I know the shallow NN can reach across the entire data set, but in smaller batches. This relates to another question I asked: I can't fit all 840000 samples on a GPU, but I can fit 300000. So how would I train 300000 at a time and still keep performance across all 840000?
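For the GPU case, a rough sketch of training one memory-sized chunk at a time while scoring a fixed random validation sample after each chunk (the 300000 chunk size comes from the question; the 20000-sample validation draw and the 'useGPU' option are assumptions to illustrate the monitoring idea):
% Train on GPU-sized chunks and watch generalization on a fixed subset
inRows  = 200;                                    % input rows (assumption from the question)
chunkSz = 300000;                                 % roughly what fits on the GPU
valIdx  = randperm(size(AllTrain, 2), 20000);     % validation sample drawn once and reused
Xval = AllTrain(1:inRows, valIdx);
Tval = AllTrain(inRows+1:end, valIdx);
for s = 1:chunkSz:size(AllTrain, 2)
    cols = s:min(s + chunkSz - 1, size(AllTrain, 2));
    net = train(net, AllTrain(1:inRows, cols), AllTrain(inRows+1:end, cols), 'useGPU', 'yes');
    mseVal = perform(net, Tval, net(Xval));       % MSE on the fixed validation sample
    fprintf('chunk starting at column %d: validation MSE %.4g\n', s, mseVal);
end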
I know I can use some of the deep NN tools for help here, but I am about to ask another question about how I might try that too, and I want to keep this one about how to use a shallow NN to achieve this, because I know the shallow NN performs well on this data set, and the deep NN tooling is its own beast.
Thank you in advance for any help here.
2 Comments
Greg Heath
2018-8-27
When you have very large datasets, an excellent approach is to FIRST consider reduction of BOTH number and dimensionality.
Consider a 1-D Gaussian distribution. How many random draws are necessary for an acceptable estimate of its mean and covariance matrix? How does that change for 2-D and 3-D?
Hope this helps.
Greg
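A rough illustration of that reduction idea in shallow-NN terms (the 0.001 variance fraction and the 100000-sample subset are arbitrary values for the sketch; the 200/6 row split comes from the question):
% Reduce input dimensionality first, then train on a random subset of samples
X = AllTrain(1:200, :);                       % input rows
T = AllTrain(201:end, :);                     % target rows
[Xr, ps] = processpca(X, 0.001);              % drop components explaining < 0.1% of the variance
keep = randperm(size(Xr, 2), 100000);         % random subset of the columns
netR = fitnet(12);
netR = train(netR, Xr(:, keep), T(:, keep));
% apply the same transform to new data: XnewR = processpca('apply', Xnew, ps);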
Answers (0)