Why does training performance change when a validation set is considered?
Hello!
This question is related to this one: http://www.mathworks.com/matlabcentral/answers/49140-is-validation-set-being-used-for-training-in-nn
For example, I considered the input and output:
    input=1:1:10
    output=[1:2:15 24 24]
and then I try 3 different options:
OPTION 1
    rand('twister',1)
    net = feedforwardnet(4);
    net.trainParam.epochs =3;
    net.divideFcn='divideind';
    [net.divideParam.trainInd,net.divideParam.valInd,net.divideParam.testInd] = divideind(10,1:10);
    [net,tr,Y1,E1] = train(net,input,output);
OPTION 2
    rand('twister',1)
    net = feedforwardnet(4);
    net.trainParam.epochs =3;
    net.divideFcn='divideind';
    [net.divideParam.trainInd,net.divideParam.valInd,net.divideParam.testInd] = divideind(10,1:8,9:10);
    %net.divideParam.trainRatio=1;net.divideParam.valRatio=0;net.divideParam.testRatio=0;
    [net,tr,Y1,E1] = train(net,input,output);
OPTION 3
    rand('twister',1)
    net = feedforwardnet(4);
    net.trainParam.epochs =3;
    net.divideFcn='divideind';
    [net.divideParam.trainInd,net.divideParam.valInd,net.divideParam.testInd] = divideind(8,1:8);
    [net,tr,Y1,E1] = train(net,input(:,1:8),output(:,1:8));
The initialisations are similar, and all 3 options stopped because they reached the maximum number of epochs. I checked epoch=0: the weights and biases are similar, but the (training) performance isn't. From epoch=0 onward, everything differs across the 3 options. If I leave divideFcn unchanged and run the same experiments as before, using the same indices for training, I have the same problem. So it isn't because of divideind! I'd like to understand why this is happening. I checked the functions step by step. Could anyone help me? Thank you very much. Ana
1 Comment
Greg Heath
2012-10-5
I took a preliminary look. Something subtle is going on.
1. Option 1 is irrelevant.
2. I chose Nepochs = 1 and rng(0) initialization.
3. The final weights for Options 2 & 3 are different (They shouldn't be).
I'll be baahk.
Aahnold.
Accepted Answer
Greg Heath
2012-11-28
The difference in the last two results was completely caused by using
1) ... = train(net,input(:,1:8),output(:,1:8));
instead of
2) ... = train(net,input,output);
Verification: For each of these 2 syntaxes I ran 3 trials for one epoch with
a. divideind(10,1:8,9:10);
b. divideind(10,1:8);
c. divideind(8,1:8);
For each syntax the 3 trials yielded identical results.
The reason why probably lies in the code of train:
type train
Hope this helps.
Thank you for officially accepting my answer.
Greg
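The comparison described in the accepted answer can be sketched roughly as below. This is a minimal sketch, not the script Greg actually ran: the variable names (`x`, `t`, `netFull`, `netSub`, `wbFull`, `wbSub`) are illustrative, only the `trainInd = 1:8` division is shown, and the training window is suppressed for convenience. The last two lines rest on an assumption that this thread does not confirm, namely that `train` configures the network's input range from all the data it receives rather than from the training indices alone.

```matlab
% Data from the question
x = 1:1:10;
t = [1:2:15 24 24];

% Syntax 2 from the answer: pass all 10 samples, train on indices 1:8 only
rng(0);                                   % same seed for both runs
netFull = feedforwardnet(4);
netFull.trainParam.epochs = 1;
netFull.trainParam.showWindow = false;    % suppress the training GUI
netFull.divideFcn = 'divideind';
netFull.divideParam.trainInd = 1:8;
netFull.divideParam.valInd   = [];
netFull.divideParam.testInd  = [];
netFull = train(netFull, x, t);

% Syntax 1 from the answer: pass only the first 8 samples
rng(0);
netSub = feedforwardnet(4);
netSub.trainParam.epochs = 1;
netSub.trainParam.showWindow = false;
netSub.divideFcn = 'divideind';
netSub.divideParam.trainInd = 1:8;
netSub.divideParam.valInd   = [];
netSub.divideParam.testInd  = [];
netSub = train(netSub, x(:,1:8), t(:,1:8));

% Compare the final weights and biases of the two runs
wbFull = getwb(netFull);
wbSub  = getwb(netSub);
fprintf('max |difference| in weights/biases: %g\n', max(abs(wbFull - wbSub)));

% Assumption (not confirmed in this thread): the configured input range may
% depend on which samples were passed to train, so it is worth inspecting.
disp(netFull.inputs{1}.range);
disp(netSub.inputs{1}.range);
```

Whether the printed difference is nonzero may depend on the toolbox release, so no expected output is given here.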
More Answers (1)
Zeeshan
2012-11-27
Hi,
I think this is because the data is divided randomly for validation of the model, so one network may be trained better than another because it was trained on a different (randomly chosen) set of training data.
I am also working on a comparison of architectures and I am going to fix the time points for each dataset for training and validation to compare them.
Regards,
Shan
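Fixing the training and validation samples, as suggested above, could look like the following sketch. The 1:6 / 7:8 / 9:10 split is an arbitrary illustration, not something taken from this thread, and the data are those from the question.

```matlab
% Data from the question
x = 1:1:10;
t = [1:2:15 24 24];

rng(0);                                  % fix the random seed for initialization
net = feedforwardnet(4);
net.trainParam.showWindow = false;       % suppress the training GUI

% feedforwardnet divides data randomly by default ('dividerand');
% switch to explicit indices so every run uses the same split.
net.divideFcn = 'divideind';
net.divideParam.trainInd = 1:6;          % illustrative split, chosen arbitrarily
net.divideParam.valInd   = 7:8;
net.divideParam.testInd  = 9:10;

[net, tr] = train(net, x, t);

% The training record reports which samples were used for each subset
disp(tr.trainInd);
disp(tr.valInd);
disp(tr.testInd);
```

With the indices fixed this way, differences between architectures are no longer confounded by a different random split on each run.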