High training error at the beginning of training a convolutional neural network
I'm working on training a convolutional neural network (CNN). During the training process, especially at the beginning, I get an extremely high training error; after that, the error starts to go down slowly. After approximately 500 epochs the training error comes close to zero (e.g. 0.006604). Then I took the final model and measured its accuracy against the testing data, and I got about 89.50%. Is that normal? I mean, is getting a high training error rate at the very beginning of the training process normal? Another thing I'd like to mention: I've noticed that every time I decrease the number of hidden nodes, the results at the end of training become better.
My CNN structure is:
config.forward_pass_scheme = {'conv_v', 'pool', 'conv_v', 'pool', 'conv_v', 'pool', 'conv_v','full', 'full', 'full', 'out'};
Here are some of my hyperparameters:
config.learning_rate = 0.01;
config.weight_range = 2;
config.decay = 0.0005;
config.normalize_init_weights = 1;
config.dropout_full_layer = 1;
config.optimization = 'adagrad';
Your help and suggestions in this regard are highly appreciated; thank you in advance.
Answers (1)
Greg Heath
2017-4-28
> During the training process, especially at the beginning of my training, I get extremely high training error; after that this error starts to go down slowly. After approximately 500 epochs the training error comes near to zero (e.g. 0.006604). Then, I took the final obtained model to measure its accuracy against the testing data, and I got about 89.50%. Is that normal?
That is not unusual.
> I mean getting a high training error rate at the very beginning of my training process.
Yes, it's not unusual.
> Another thing I'd like to mention is that I've noticed that every time I decrease the number of hidden nodes, the results become better at the end of my training.
This is also not unusual. It often occurs when an overfit net (i.e., H > Hub, and especially H >> Hub; see below) is overtrained.
Assume
[ I N ] = size(input) % "I"nput matrix
[ O N ] = size(target) % "O"utput target matrix
[ O N ] = size(output) % network output matrix
Ntrn = 0.7*N % Default value for number of training examples
Ntrneq = Ntrn*O % Number of training equations
H = numberofhiddennodes
[ H I ] = size(IW) % IW = input weight matrix
[ H 1 ] = size(b1) % b1 = input bias vector
[ O H ] = size(LW) % LW = layer weight matrix
[ O 1 ] = size(b2) % b2 = output bias vector
Then, the number of unknown weights is
Nw = (I+1)*H + (H+1)*O
The number of unknowns exceeds the number of equations when
Nw > Ntrneq
or
H > Hub
where the upper bound, obtained by solving Nw > Ntrneq for H, is
Hub = (Ntrneq-O)/(I+O+1)
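As a quick sanity check, here is a minimal MATLAB sketch of this bookkeeping for a single-hidden-layer net (the formula above is for that case, not for a full CNN). The dimensions I, O, N and the candidate H are assumed example values, not taken from the original post:
% Assumed example dimensions (not from the original post)
I = 10;                            % number of input features
O = 2;                             % number of output nodes
N = 1000;                          % total number of examples
Ntrn   = 0.7*N;                    % default training split
Ntrneq = Ntrn*O;                   % number of training equations
H   = 200;                         % candidate number of hidden nodes
Nw  = (I+1)*H + (H+1)*O;           % number of unknown weights
Hub = (Ntrneq - O)/(I + O + 1);    % upper bound on H
fprintf('Nw = %d, Ntrneq = %d, Hub = %.1f\n', Nw, Ntrneq, Hub)
if H > Hub                         % equivalent to Nw > Ntrneq
    disp('More unknowns than equations: the net is overfit.')
end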
When H > Hub, there are two common ways to mitigate the resulting overfitting (see the code sketch after the list below).
1. STOPPED TRAINING: Stop training when the error on a validation subset
increases for a specified number of epochs. The defaults in the NN Toolbox are
a. Automatic trn/val/tst subset data division in the ratios 0.7/0.15/0.15
b. A 6-epoch limit on consecutive val subset error increases.
2. REGULARIZATION: Use a trn/val/tst subset data division in the ratio
0.85/0.0/0.15 with the training function TRAINBR, which uses a "regularized"
error function consisting of a weighted sum of the sum-squared error and the
sum-squared weights.
3. Other useful search terms in the NEWSGROUP, ANSWERS, and on the internet are
a. OVERFITTING
b. OVERTRAINING
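For concreteness, here is a minimal sketch of both options using the shallow-network functions of the NN Toolbox. The simplefit_dataset demo data is used as a stand-in for your own x/t matrices, and the hidden layer size of 10 is an arbitrary assumption:
% Assumed placeholder data: x is I-by-N inputs, t is O-by-N targets
[x, t] = simplefit_dataset;          % toolbox demo data as a stand-in

% 1. Stopped training (early stopping) with the default 0.7/0.15/0.15 split
net1 = fitnet(10);                   % 10 hidden nodes (assumed)
net1.divideParam.trainRatio = 0.70;
net1.divideParam.valRatio   = 0.15;
net1.divideParam.testRatio  = 0.15;
net1.trainParam.max_fail    = 6;     % stop after 6 consecutive val error increases
net1 = train(net1, x, t);

% 2. Bayesian regularization (TRAINBR) with a 0.85/0.0/0.15 split
net2 = fitnet(10, 'trainbr');        % no validation subset needed
net2.divideParam.trainRatio = 0.85;
net2.divideParam.valRatio   = 0.00;
net2.divideParam.testRatio  = 0.15;
net2 = train(net2, x, t);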
Hope this helps.
Thank you for formally accepting my answer
Greg