my dlgradient returns all "0"

Question

世锬 2024-3-18

Direct link to this question

https://ww2.mathworks.cn/matlabcentral/answers/2095706-my-dlgradient-returns-all-0

Answered: arushi 2024-9-10, 5:39

The network and the model loss function are defined as follows:

layers1 = [
    sequenceInputLayer([4 1 2],"Name","betaIn")
    convolution2dLayer([3 2],32,"Name","conv1_1","Padding",[1 1 1 1],"WeightL2Factor",0)
    reluLayer("Name","relu1_1")
    convolution2dLayer([3 1],64,"Name","conv1_2","Padding",[1 1 1 1],"WeightL2Factor",0)
    reluLayer("Name","relu1_2")
    maxPooling2dLayer([2 2],"Name","pool1")
    convolution2dLayer([3 2],128,"Name","conv2_1","Padding",[1 1 1 1],"WeightL2Factor",0)
    reluLayer("Name","relu2_1")
    convolution2dLayer([2 2],128,"Name","conv2_2","Padding",[1 1 1 1],"WeightL2Factor",0)
    reluLayer("Name","relu2_2")
    maxPooling2dLayer([2 2],"Name","pool2")
    convolution2dLayer([2 2],64,"Name","conv3_1","Padding",[1 1 1 1],"WeightL2Factor",0)
    reluLayer("Name","relu3_1")
    convolution2dLayer([3 3],32,"Name","conv3_2","Padding",[1 1 1 1],"WeightL2Factor",0)
    reluLayer("Name","relu3_2")
    convolution2dLayer([3 3],2,"Name","conv3_3","Padding",[1 1 1 1],"WeightL2Factor",0)
    reluLayer("Name","F")];
layers2 = [
    sequenceInputLayer([5 1 2],"Name","alpha")
    alphaMultiplyF("ComplexMultiply")
    ];
net=dlnetwork(layers1);
net=addLayers(net,layers2);
net=connectLayers(net,"F","ComplexMultiply/F");
net=initialize(net);
function [loss,gradients,state] = modelLoss(net,beta,alpha,T)
    % Forward data through the network (two inputs: beta and alpha).
    [Y,state] = forward(net,beta,alpha);
    % Calculate mean squared error loss.
    loss = mse(Y,T);
    % Calculate gradients of loss with respect to learnable parameters.
    gradients = dlgradient(loss,net.Learnables);
end

Answer 1

arushi 2024-9-10, 5:39

Direct link to this answer

https://ww2.mathworks.cn/matlabcentral/answers/2095706-my-dlgradient-returns-all-0#answer_1513704

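When dlgradient returns zeros for every learnable parameter, the loss is either flat with respect to the parameters at the current point, or it does not depend on them through traced dlarray operations. Possible causes include the network architecture (for example, a ReLU such as the final "F" layer outputs zero and passes back zero gradient wherever its input is negative), the custom layer, the loss function, the data, or how the gradients are calculated (dlgradient must be called inside a function that is evaluated with dlfeval). Here are a few steps you can take to debug the issue: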

Inspect Learnables: Check net.Learnables to ensure it contains the parameters you expect.
Test Custom Layer: If possible, isolate and test your custom layer (alphaMultiplyF) to ensure it correctly computes forward and backward passes.
Simplify the Model: Temporarily simplify your model to a minimal version that should be capable of learning (e.g., remove some layers). This can help identify if a specific part of the network is causing the issue.
Check Outputs: Before calculating the loss, inspect the outputs of the network (Y) to ensure they're reasonable and not all zeros or NaNs.
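
The checks above can be sketched roughly as follows. This is a minimal illustration, not the network from the question: the small feature-input network, the random data, and the helper name debugLoss are all assumptions made so the script runs on its own (the custom alphaMultiplyF layer is not available here). Substitute your own net, inputs, and targets. For the custom layer itself, the checkLayer function in Deep Learning Toolbox may also be worth trying.

```matlab
% Hypothetical stand-in network (NOT the network from the question).
layers = [
    featureInputLayer(3,"Name","in")
    fullyConnectedLayer(4,"Name","fc")
    reluLayer("Name","relu")];
net = dlnetwork(layers);

% Random stand-in data: 3 features, 4 outputs, batch of 5 observations.
X = dlarray(rand(3,5,"single"),"CB");
T = dlarray(rand(4,5,"single"),"CB");

% Inspect Learnables: list the parameters dlgradient will differentiate.
disp(net.Learnables)

% dlgradient only works inside a function evaluated by dlfeval.
[loss,gradients,Y] = dlfeval(@debugLoss,net,X,T);

% Check Outputs: look for all-zero or NaN network outputs.
Yv = extractdata(Y);
fprintf("fraction of zero outputs: %.2f, any NaN: %d\n", ...
    mean(Yv(:) == 0), any(isnan(Yv(:))));

% Report the largest gradient magnitude for each learnable parameter.
for i = 1:height(gradients)
    g = extractdata(gradients.Value{i});
    fprintf("%s/%s: max|grad| = %g\n", ...
        gradients.Layer(i), gradients.Parameter(i), max(abs(g(:))));
end

function [loss,gradients,Y] = debugLoss(net,X,T)
    % Same structure as modelLoss, but also returns Y for inspection.
    Y = forward(net,X);
    loss = mse(Y,T);
    gradients = dlgradient(loss,net.Learnables);
end
```

If the fraction of zero outputs is 1, the final ReLU is blocking all gradient flow, and removing or replacing that last activation is a reasonable first experiment. If only some parameters show a zero maximum, the layers between them and the loss are the place to look.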