Is the validation set being used for training in a NN?
I'm using the Neural Network Toolbox and the "divideind" function. I split the whole set into training and validation sets to use the early stopping criterion:
net.divideFcn = 'divideind';
[net.divideParam.trainInd,net.divideParam.valInd,net.divideParam.testInd] = divideind(10000,1:7000,7001:10000);
The thing is, I thought the training was performed with just the training set and the validation performance was computed each epoch, but I realised the NN is being influenced by the whole set (training + validation). I really don't know how, and I'd like to change it! I tried:
[net.divideParam.trainInd,net.divideParam.valInd,net.divideParam.testInd] = divideind(10000,1:7000,7001:10000);
[net.divideParam.trainInd,net.divideParam.valInd,net.divideParam.testInd] = divideind(7000,1:7000);
and they give different results (for the same epochs). Using the whole set for training the results are more similar, but still different:
[net.divideParam.trainInd,net.divideParam.valInd,net.divideParam.testInd] = divideind(10000,1:10000);
Because of this, I'm computing the test error separately to be sure the test set is not being used for training!
Do you know what is happening? Do you think I can do it as I need, i.e. 7000 just for training and 3000 just for validation? Thank you!
Accepted Answer
Greg Heath
2012-9-26
% The short answer is no.
> I'm using the Neural Network Toolbox and the "divideind" function. I
> split the whole set into training and validation sets to use the early
> stopping criterion:
> net.divideFcn = 'divideind';
> [net.divideParam.trainInd,net.divideParam.valInd,net.divideParam.testInd]
> = divideind(10000,1:7000,7001:10000);
% If you are trying to debug, use a very small data sample and omit the
% command-ending semicolons.
>> net = fitnet;
net.divideFcn = 'divideind';
[net.divideParam.trainInd,net.divideParam.valInd,net.divideParam.testInd]
= divideind(10,1:7,8:10) % small data set & no semicolon
ans = []
= divideind(10,1:7,8:10)
|
Error: The expression to the left of the equals sign is not a valid target for an assignment.
% I cannot get your syntax to work. What version do you have?
% On the other hand
>> net = fitnet;
net.divideFcn='divideind';
[trainInd,valInd,testInd] = divideind(10,1:7,8:10)
trainInd = 1 2 3 4 5 6 7
valInd = 8 9 10
testInd = []
% These can be assigned to the net separately
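% For example, a minimal sketch of assigning the three index vectors one
% property at a time (using the variables from the call above):
net.divideParam.trainInd = trainInd; % 1:7 -- used for the weight updates
net.divideParam.valInd = valInd; % 8:10 -- used only for early stopping
net.divideParam.testInd = testInd; % [] -- no test subset in this example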
> The thing is I thought the training was performed just with the training
> set, and each epoch the validation performance was computed, but I
> realised the NN is being influenced by the whole set (training+validation).
% You are mistaken.
% The validation set affects ONLY the stopping epoch.
> I really don't know how and I'd like to change it! I tried:
> [net.divideParam.trainInd,net.divideParam.valInd,net.divideParam.testInd]
> = divideind(10000,1:7000,7001:10000);
> [net.divideParam.trainInd,net.divideParam.valInd,net.divideParam.testInd]
> = divideind(7000,1:7000);
> and they give different results (for the same epochs).
% That is because you did not use the same random number seed (help rng)
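% A minimal sketch, assuming the default weight initialization draws from
% MATLAB's global random stream:
rng(0) % same seed => same random initial weights
[net1,tr1] = train(net,input,output);
rng(0) % reset the stream so run 2 starts from the same weights
[net2,tr2] = train(net,input,output);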
> Using the whole set for training the results are more similar, but still different:
> [net.divideParam.trainInd,net.divideParam.valInd,net.divideParam.testInd]
> = divideind(10000,1:10000); And because of this I'm computing the test
> error separately to be sure the test set is not being used for training!
> Do you know what is happening? Do you think I can do it as I need, i.e.
> 7000 just for training and 3000 just for validation?
% You can safely include all three subsets in the division. They work as they should.
% Use the structure tr when training and, after training, omit the semicolon to reveal its contents.
[net,tr] = train(net,input,output);
tr = tr
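% A few fields of the training record worth checking (names as displayed
% by the toolbox; the exact field list may vary by release):
tr.trainInd % indices actually used for the weight updates
tr.valInd % indices used only for early stopping
tr.best_epoch % epoch with the lowest validation error
tr.stop % why training stopped, e.g. 'Validation stop.'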
Hope this helps.
Thank you for formally accepting my answer.
Greg
3 Comments
Greg Heath
2012-9-27
> I disagree. I made this simple experiment: I considered 7000 examples,
> and 3000 more different examples which, together with the 7000, lead to
> bad learning. I considered:
> 1. As input I considered 7000, all for training.
> 2. As input I considered 10000, 7000 for training and 3000 for validation.
> 3. As input I considered 10000, all for training.
> All the results are different.
I'm not surprised.
> but 2 and 3 are much worse than 1. If the training is not considering
> the 3000 examples that I use for validation, then the training
> performance should be similar for 1 and 2. And it is not. Thank you very
> much. Ana
> Sizes of trn/val/tst matrices?
What are you using as an unbiased test set? Do you have more than 10,000 cases?
Oh! You are using the combined val/tst set as a BIASED test set?
Did you check the training window to see why each experiment was terminated?
Assuming you specified the same random number seed to create the initial weights:
1 & 3. Successful training proceeds to training-set convergence or the maximum number of epochs.
2. Successful training proceeds until one of the two conditions above is reached OR the validation error hits a minimum (beyond which the next 6 validation errors are monotonically increasing).
In the real world,
1. Convergence may be due to a nonglobal local min.
2. Training may terminate for other reasons, e.g.,
a. maximum mu reached
b. minimum gradient reached
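A minimal sketch for checking why a run terminated (the trainParam names below are the trainlm stopping controls; default values may differ by release):
net.trainParam.max_fail = 6; % validation failures allowed before early stopping
net.trainParam.min_grad = 1e-7; % minimum-gradient stopping threshold
net.trainParam.mu_max = 1e10; % maximum mu for Levenberg-Marquardt
[net,tr] = train(net,input,output);
tr.stop % reason string, e.g. 'Validation stop.'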
Hope this helps.
Greg
P.S. I always train 10 nets and look at the combined tabulations. Search for examples using the keywords heath close clear Ntrials
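A minimal sketch of that multiple-trials approach (Ntrials, the hidden-layer size, and the R^2-style tabulation are illustrative choices following the convention in those posts, assuming output is a single row of targets):
Ntrials = 10; % number of independently initialized nets
R2val = zeros(Ntrials,1); % one summary statistic per trial
for i = 1:Ntrials
    net = fitnet(10); % hidden-layer size chosen for illustration
    net.divideFcn = 'divideind';
    net.divideParam.trainInd = 1:7000;
    net.divideParam.valInd = 7001:10000;
    net.divideParam.testInd = [];
    [net,tr] = train(net,input,output);
    R2val(i) = 1 - tr.best_vperf/var(output,1); % 1 - MSEval/MSE00
end
R2val % compare the tabulation across the 10 trials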