High training error at the beginning of training a convolutional neural network
I'm working on training a convolutional neural network (CNN). During the training process, especially at the beginning, I get an extremely high training error; after that, the error starts to go down slowly. After approximately 500 epochs the training error comes close to zero (e.g. 0.006604). Then I took the final model and measured its accuracy against the testing data, and I got about 89.50%. Is that normal? I mean, is getting a high training error rate at the very beginning of the training process normal? Another thing I'd like to mention: I've noticed that every time I decrease the number of hidden nodes, the results at the end of training become better.
My CNN structure is:
config.forward_pass_scheme = {'conv_v', 'pool', 'conv_v', 'pool', 'conv_v', 'pool', 'conv_v','full', 'full', 'full', 'out'};
Here are some of my hyperparameters:
config.learning_rate = 0.01;
config.weight_range = 2;
config.decay = 0.0005;
config.normalize_init_weights = 1;
config.dropout_full_layer = 1;
config.optimization = 'adagrad';
Your help and suggestions in this regard are highly appreciated; thank you in advance.
Answers (1)
Greg Heath
2017-4-28
> During the training process, especially at the beginning of my training, I get extremely high training error; after that this error starts to go down slowly. After approximately 500 epochs the training error comes near to zero (e.g. 0.006604). Then, I took the final obtained model to measure its accuracy against the testing data, and I got about 89.50%. Is that normal?
That is not unusual.
> I mean getting a high training error rate at the very beginning of my training process.
Yes, it's not unusual.
> Another thing I'd like to mention is that I've noticed that every time I decrease the number of hidden nodes, the results become better at the end of my training.
This is also not unusual. It often occurs when an overfit net (i.e., H > Hub, and especially H >> Hub; see below) is overtrained.
Assume
[ I N ] = size(input) % "I"nput matrix
[ O N ] = size(target) % "O"utput target matrix
[ O N ] = size(output) % network output matrix
Ntrn = 0.7*N % Default value for number of training examples
Ntrneq = Ntrn*O % Number of training equations
H = numberofhiddennodes
[ H I ] = size(IW) % IW = input weight matrix
[ H 1 ] = size(b1) % b1 = input bias vector
[ O H ] = size(LW) % LW = layer weight matrix
[ O 1 ] = size(b2) % b2 = output bias vector
Then, the number of unknown weights is
Nw = (I+1)*H + (H+1)*O
The number of unknowns exceeds the number of equations when
Nw > Ntrneq
or
H > Hub
where the upper bound, obtained by solving Nw > Ntrneq for H, is
Hub = (Ntrneq-O)/(I+O+1)
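As a quick sanity check, here is a minimal MATLAB sketch of this bookkeeping for a single-hidden-layer net (the formula above is for that case, not for a full CNN). The dimensions I, O, N and the candidate H are assumed example values, not taken from the original post:
% Assumed example dimensions (not from the original post)
I = 10;                            % number of input features
O = 2;                             % number of output nodes
N = 1000;                          % total number of examples
Ntrn   = 0.7*N;                    % default training split
Ntrneq = Ntrn*O;                   % number of training equations
H   = 200;                         % candidate number of hidden nodes
Nw  = (I+1)*H + (H+1)*O;           % number of unknown weights
Hub = (Ntrneq - O)/(I + O + 1);    % upper bound on H
fprintf('Nw = %d, Ntrneq = %d, Hub = %.1f\n', Nw, Ntrneq, Hub)
if H > Hub                         % equivalent to Nw > Ntrneq
    disp('More unknowns than equations: the net is overfit.')
end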
When H > Hub, there are two common ways to mitigate the resulting overfitting (see the code sketch after the list below).
1. STOPPED TRAINING: Stop training when the error on a validation subset
increases for a specified number of epochs. The defaults in the NN Toolbox are
a. Automatic trn/val/tst subset data division in the ratios 0.7/0.15/0.15
b. A 6-epoch limit on consecutive val subset error increases.
2. REGULARIZATION: Use a trn/val/tst subset data division in the ratio
0.85/0.0/0.15 with the training function TRAINBR, which uses a "regularized"
error function consisting of a weighted sum of the sum-squared error and the
sum-squared weights.
3. Other useful search terms in the NEWSGROUP, ANSWERS, and on the internet are
a. OVERFITTING
b. OVERTRAINING
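For concreteness, here is a minimal sketch of both options using the shallow-network functions of the NN Toolbox. The simplefit_dataset demo data is used as a stand-in for your own x/t matrices, and the hidden layer size of 10 is an arbitrary assumption:
% Assumed placeholder data: x is I-by-N inputs, t is O-by-N targets
[x, t] = simplefit_dataset;          % toolbox demo data as a stand-in

% 1. Stopped training (early stopping) with the default 0.7/0.15/0.15 split
net1 = fitnet(10);                   % 10 hidden nodes (assumed)
net1.divideParam.trainRatio = 0.70;
net1.divideParam.valRatio   = 0.15;
net1.divideParam.testRatio  = 0.15;
net1.trainParam.max_fail    = 6;     % stop after 6 consecutive val error increases
net1 = train(net1, x, t);

% 2. Bayesian regularization (TRAINBR) with a 0.85/0.0/0.15 split
net2 = fitnet(10, 'trainbr');        % no validation subset needed
net2.divideParam.trainRatio = 0.85;
net2.divideParam.valRatio   = 0.00;
net2.divideParam.testRatio  = 0.15;
net2 = train(net2, x, t);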
Hope this helps.
Thank you for formally accepting my answer
Greg