Deep learning using CNN - question about training.

Question

I have a question about a CNN I'm training. During training everything goes well, with high accuracy on both the training and validation sets. However, once I stop training, the final model that is returned has much lower accuracy (see the image below).

Now I have two questions: 1) Why does this happen? I expected the returned model to perform on par with the plot on both the training and validation sets. 2) After I manually stop training at a certain point, it takes quite a long time for the model to be returned (anywhere from 5 to 50 minutes, depending on the size of the data and other parameters such as the depth of the network). Why does this happen?

Answer 1

Johannes Bergstrom 2018-11-26

Does your network use batch normalization layers?

After training finishes, trainNetwork loops through the whole data set to calculate the batch normalization statistics required to create a network ready for prediction. This answers your question 2), why it takes so long.

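Also, you train for much less than one epoch, which means that the data seen so far by the network might not be representative of either the training set or the validation set. That would explain question 1): the statistics computed over the whole data set after you stop can differ from the mini-batch statistics the network saw during training.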

Try making sure that you

1. Shuffle the training data (see 'Shuffle')

2. Train for multiple epochs

Finally, looking at the large 'bump' in the training loss for the very first 10 iterations, it seems like your learning rate is too high.
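
As a rough sketch, the three suggestions above (shuffling, training for several epochs, and a lower learning rate) could be expressed through trainingOptions as shown here. The solver choice, all numeric values, and the variable names XTrain, YTrain, XValidation, YValidation, and layers are assumptions for illustration; they are placeholders for your own data and network, not tuned settings.

```matlab
% Sketch only: values are illustrative assumptions, not recommendations.
% XTrain, YTrain, XValidation, YValidation and layers are placeholders
% for your own data and layer array.
options = trainingOptions('sgdm', ...
    'Shuffle', 'every-epoch', ...       % reshuffle the training data before each epoch
    'MaxEpochs', 10, ...                % several full passes instead of a fraction of one
    'InitialLearnRate', 1e-3, ...       % reduce further if the loss still spikes early on
    'MiniBatchSize', 64, ...
    'ValidationData', {XValidation, YValidation}, ...
    'Plots', 'training-progress');

net = trainNetwork(XTrain, YTrain, layers, options);
```

If the loss curve still shows a large bump in the first iterations, lowering 'InitialLearnRate' by a factor of ten and comparing the plots is a reasonable next step.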
