Unexpected loss reduction using custom training loop in Deep Learning Toolbox
18 views (last 30 days)
MathWorks Support Team
2023-7-19
Answered: MathWorks Support Team
2023-8-3
I have created a custom training loop following the documentation example: https://www.mathworks.com/help/releases/R2023a/deeplearning/ug/train-network-using-custom-training-loop.html
However, since I use the same loss function for training and validation, I moved the call to "forward" outside of the "modelLoss" function. For example:
[Y,state] = forward(net,X);
[loss,gradients] = dlfeval(@modelLoss,net,Y,T);

function [loss,gradients] = modelLoss(net,Y,T)
    % Calculate cross-entropy loss.
    loss = crossentropy(Y,T);
    % Calculate gradients of loss with respect to learnable parameters.
    gradients = dlgradient(loss,net.Learnables);
end
Now the resulting loss during training is not reducing as expected. How can I resolve this issue?
Accepted Answer
MathWorks Support Team
2023-7-19
When "dlgradient" is called inside a function that is evaluated by "dlfeval", automatic differentiation is used to compute the gradients. "dlfeval" traces the operations performed inside the function it evaluates, so for the gradients to be computed correctly, every operation involving the learnable parameters, including the "forward" call, must remain inside the "modelLoss" function passed to "dlfeval". If "forward" is called outside, its operations are not traced, and "dlgradient" cannot propagate derivatives back through the network, so training does not behave as expected.
For more information, refer to the Deep Learning Toolbox documentation on automatic differentiation.
Moving the "forward" call back inside the "modelLoss" function will resolve the issue. Additionally, since the gradient is not required for validation, using "dlfeval" to compute the validation loss introduces unnecessary overhead and decreases performance; compute the validation loss directly instead.
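A minimal sketch of the corrected pattern, assuming "net" is a dlnetwork and "X", "T", "XVal", and "TVal" are formatted dlarray inputs and targets as in the documentation example (those validation variable names are illustrative, not from the original post):

```matlab
% Training: "forward" stays inside the function traced by dlfeval,
% so dlgradient can differentiate through the network operations.
[loss,gradients,state] = dlfeval(@modelLoss,net,X,T);

% Validation: no gradients are needed, so skip dlfeval entirely and
% use "predict", which runs the network in inference mode.
YVal = predict(net,XVal);
lossValidation = crossentropy(YVal,TVal);

function [loss,gradients,state] = modelLoss(net,X,T)
    % Forward pass must happen inside this function so that dlfeval
    % traces it for automatic differentiation.
    [Y,state] = forward(net,X);
    % Calculate cross-entropy loss.
    loss = crossentropy(Y,T);
    % Calculate gradients of loss with respect to learnable parameters.
    gradients = dlgradient(loss,net.Learnables);
end
```

This way the same loss function ("crossentropy") is still shared between training and validation, but only the training path pays the cost of gradient tracing.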