Custom performance vectors for neural network training

Question

Harold 2013-5-8

Direct link to this question:

https://ww2.mathworks.cn/matlabcentral/answers/75098-custom-performance-vectors-for-neural-network-training

I'm working on pattern recognition using MATLAB's built-in Neural Network Toolbox. I've used the toolbox to generate code and have successfully implemented it in a working GUI. The problem I am trying to solve now is letting the user select the validation and testing vectors from a file. For example, I'm training the network to recognize the 4 letters "ABCD". I've been reading in the documentation that validation samples are used to measure network generalization, i.e., to find out how my network would perform on data it has never seen before. There are also testing samples, which are used to give an independent measure of network performance during and after training and to determine when to stop training.

I would still like to use these. A workaround is to combine my training, validation, and testing vectors into one matrix. I then use this as my training matrix and use the code below to separate the vectors back out. Train, Val, and Test are the sample counts of the original training, validation, and testing vectors, each obtained with size(). The matrix data contains the original training, validation, and testing vectors concatenated column-wise.

% Setup Division of Data for Training, Validation, Testing
% For a list of all data division functions type: help nndivide
net.divideFcn = 'dividerand';  % Divide data randomly
net.divideMode = 'sample';  % Divide up every sample
net.divideParam.trainRatio = Train/size(data,2);
net.divideParam.valRatio = Val/size(data,2);
net.divideParam.testRatio = Test/size(data,2);

The one problem I see with this is fractional ratios. For example, let the original training vector, validation vector, and testing vector all be 10x1 (the rows contain the numbers for the network, the columns are the number of sample sets). Then trainRatio, valRatio, and testRatio would all be 3.333333333333333e-01. I'm not sure whether MATLAB will split the data into three parts without throwing an error because of the fractional values.

Any thoughts on this, or workarounds?


Answer 1

Greg Heath 2013-5-10

Direct link to this answer:

https://ww2.mathworks.cn/matlabcentral/answers/75098-custom-performance-vectors-for-neural-network-training#answer_84980

Edited: Greg Heath 2013-5-10

% I've been reading in the documentation that validation samples are used to measure network generalization, i.e., to find out how my network would perform on data it has never seen before.

No. See below.

% There are also testing samples, which are used to give an independent measure of network performance during and after training and to determine when to stop training.

1. total = design + test

2. design = training + validation

3. training:

 a. Used to obtain weight values given training parameters.
 b. Training error estimates tend to be extremely biased as the number
    of unknown weights, Nw, increases toward the number of training
    equations, Ntrneq.
 c. Ndof = Ntrneq - Nw is the number of estimation degrees of freedom
    (see Wikipedia). As long as Ndof is sufficiently positive, the bias
    of estimating error with training data can be mitigated somewhat
    by the degree-of-freedom adjustment: dividing SSEtrn by Ndof
    instead of Ntrneq.

4. validation:

 a. Used repeatedly with the training set to determine a good set of training parameters (especially the stopping epoch) by choosing the best of multiple random initial weight designs.
 b. Validation set error tends to be much less biased than training set error, especially if training doesn't stop because of validation error convergence.

5. test:

 a. Used once, and only once, to obtain an unbiased error estimate on nontraining data.
 b. If performance is unsatisfactory and more designs are necessary, the data should be repartitioned into new tr/val/tst subsets.
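
The degree-of-freedom adjustment in item 3c can be illustrated with a small numeric sketch. Everything here is an assumption made for illustration: the network sizes, the SSE value, the convention that Ntrneq is the number of training samples times the number of outputs, and the weight count for a network with a single hidden layer. None of these numbers come from the question.

```matlab
% Hypothetical sizes (assumptions for illustration only)
I = 35;      % inputs per sample
H = 2;       % hidden nodes
O = 4;       % outputs, one per letter A-D
Ntrn = 40;   % training samples

Ntrneq = Ntrn * O;               % number of training equations (assumed convention)
Nw = (I + 1) * H + (H + 1) * O;  % weights plus biases, single hidden layer (assumed)
Ndof = Ntrneq - Nw;              % estimation degrees of freedom

SSEtrn = 8.4;                    % hypothetical sum of squared training errors
MSEtrn  = SSEtrn / Ntrneq;       % unadjusted training error estimate
MSEtrna = SSEtrn / Ndof;         % degree-of-freedom adjusted estimate

fprintf('Ntrneq = %d, Nw = %d, Ndof = %d\n', Ntrneq, Nw, Ndof);
fprintf('MSEtrn = %.4f, MSEtrna = %.4f\n', MSEtrn, MSEtrna);
```

Because Ndof is smaller than Ntrneq, the adjusted estimate is always the larger of the two, and the gap widens as Nw approaches Ntrneq.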

% I would still like to use these. A workaround is to combine my training, validation, and testing vectors into one matrix.

No. This is not a workaround. 'dividerand' is the default.

MATLAB uses

 Ntst = round(tstratio*N)
 Nval = round(valratio*N)
 Ntrn = N - Nval - Ntst.
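
As a worked check of the formulas above, here is a minimal sketch applied to the case from the question (three columns, each ratio equal to 1/3) and to a case where the products are not whole numbers. It only reproduces the arithmetic quoted in this answer and does not call the toolbox, so it should not be read as the toolbox's internal code.

```matlab
% Case from the question: 3 sample columns, ratios of 1/3 each
N = 3;
valratio = 1 / N;   % 3.333e-01, not exactly representable in binary
tstratio = 1 / N;

Ntst = round(tstratio * N);
Nval = round(valratio * N);
Ntrn = N - Nval - Ntst;   % training takes whatever remains

fprintf('N = %d: Ntrn = %d, Nval = %d, Ntst = %d\n', N, Ntrn, Nval, Ntst);

% A case where ratio*N is not a whole number (hypothetical N)
N2 = 10;
Ntst2 = round((1/3) * N2);
Nval2 = round((1/3) * N2);
Ntrn2 = N2 - Nval2 - Ntst2;

fprintf('N = %d: Ntrn = %d, Nval = %d, Ntst = %d\n', N2, Ntrn2, Nval2, Ntst2);
```

Under these formulas the fractional ratios are not a problem: rounding always gives integer counts, and the training set absorbs the remainder.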

The training record, tr, contains the indices for each subset.

[net, tr, Y, E] = train(net, x, t);

Y is the output and E is the error, E = t - Y.
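
Once the indices are available, the error can be reported separately for each subset. The sketch below uses small hypothetical arrays in place of the real outputs of train, and a plain struct standing in for the training record (only its trainInd, valInd, and testInd fields are mimicked), so it runs without the toolbox.

```matlab
% Hypothetical stand-ins for t, Y and the training record tr
t = [1 0 1 0 1 0];                 % targets, one column per sample
Y = [0.9 0.2 0.8 0.1 0.5 0.5];     % network outputs
tr.trainInd = [1 2 3 4];           % indices of the training columns
tr.valInd   = 5;                   % indices of the validation columns
tr.testInd  = 6;                   % indices of the test columns

E = t - Y;                         % error, as defined above

% Mean squared error over all elements of a matrix of errors
mseOf = @(e) mean(e(:).^2);

MSEtrn = mseOf(E(:, tr.trainInd));
MSEval = mseOf(E(:, tr.valInd));
MSEtst = mseOf(E(:, tr.testInd));

fprintf('MSEtrn = %.3f, MSEval = %.3f, MSEtst = %.3f\n', MSEtrn, MSEval, MSEtst);
```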

If you want, you can call dividerand at any time before training to obtain the indices, then assign those indices to the net by using divideind.
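
For the case in the question, where the user supplies the validation and test vectors, the indices can also be built directly from the sizes of the three matrices and assigned with divideind. This is a minimal sketch: the variable names xTrain, xVal, and xTest, the data sizes, and the hidden layer size are all assumptions, and the network part is guarded so that it only runs when the toolbox is installed.

```matlab
% Hypothetical user-supplied data: rows are features, columns are samples
xTrain = rand(35, 40);
xVal   = rand(35, 10);
xTest  = rand(35, 10);
data   = [xTrain, xVal, xTest];   % combined column-wise, as in the question

Ntrn = size(xTrain, 2);
Nval = size(xVal, 2);
Ntst = size(xTest, 2);

% Column indices of each subset inside the combined matrix
trainInd = 1:Ntrn;
valInd   = Ntrn + (1:Nval);
testInd  = Ntrn + Nval + (1:Ntst);

% The network setup needs the Neural Network Toolbox, so it is guarded
if exist('patternnet', 'file')
    net = patternnet(10);            % hidden layer size chosen arbitrarily
    net.divideFcn = 'divideind';     % divide by explicit indices
    net.divideParam.trainInd = trainInd;
    net.divideParam.valInd   = valInd;
    net.divideParam.testInd  = testInd;
end
```

If a random split is wanted instead, the three index vectors can come from [trainInd, valInd, testInd] = dividerand(N, trainRatio, valRatio, testRatio) and be assigned in the same way.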

Hope this helps.

*Thank you for formally accepting my answer

Greg

1 Comment

Harold 2013-5-10

Thank you Greg, I will have to try dividerand before training like you suggested. As of right now, I do not do any kind of validation or testing.
