How does L2 regularization in a custom training loop work?
Hi,
I implemented a custom training loop to train a sequence-to-sequence regression model. I also implemented L2 regularization as described in the documentation here: https://de.mathworks.com/help/deeplearning/ug/specify-training-options-in-custom-training-loop.html#mw_50581933-e0ce-4670-9456-af23b2b6f337
Now I'm wondering how this works. Other documentation, such as this one from Google, seems to describe it differently: Google describes L2 regularization as adding the sum of the squared weights to the loss, whereas in MATLAB it looks like I add the weights to the gradients. Isn't that something different? Is one way better than the other?
Cheers
Accepted Answer
Richard
2024-9-26
Short story: these are the same.
Long story: the link you gave explains the underlying mathematics of L2 regularization and its effect on the weights. However, it doesn't really explain the mechanics of how adding a regularization term to the loss affects the minimization algorithm; it stops at saying you "minimize(loss + lambda*complexity)".
In this case the minimization is done by taking steps based on the gradient of the total loss with respect to each weight. For each weight parameter w, the required gradient d(loss + lambda*complexity)/dw equals dL/dw + d(lambda*complexity)/dw, i.e. the gradient is the sum of the non-regularized gradient and a regularization term. When the complexity term is the sum of squared weights, d(lambda*w^2)/dw = 2*lambda*w, so the regularization term is just a multiple of the weight itself (the constant factor of 2 is conventionally absorbed into lambda). This sum is exactly what the MATLAB example computes when it adds the weights to the gradients.
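As a minimal sketch of this equivalence, here is a toy linear model (the names W, X, T, lambda and the two local functions are hypothetical placeholders, not from the documentation example; requires Deep Learning Toolbox) that computes the gradient both ways and confirms they agree:
% Toy weight matrix, inputs, and targets (hypothetical placeholders)
W = dlarray(randn(3));
X = dlarray(randn(3,5));
T = dlarray(randn(3,5));
lambda = 0.01;    % regularization coefficient

% Approach 1: add the squared-weight penalty to the loss, then differentiate.
grad1 = dlfeval(@penalizedGradient, W, X, T, lambda);

% Approach 2: differentiate the plain loss, then add a multiple of the weights.
grad2 = dlfeval(@plainGradient, W, X, T) + 2*lambda*W;

% The two gradients agree up to floating-point error.
max(abs(extractdata(grad1 - grad2)), [], "all")

function grad = penalizedGradient(W, X, T, lambda)
    loss = sum((W*X - T).^2, "all") + lambda*sum(W.^2, "all");
    grad = dlgradient(loss, W);
end

function grad = plainGradient(W, X, T)
    loss = sum((W*X - T).^2, "all");
    grad = dlgradient(loss, W);
end
Note that the documentation example absorbs the factor of 2 into its l2Regularization coefficient, which is why it adds l2Regularization*w rather than 2*l2Regularization*w to each gradient.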