My dlgradient returns all zeros

Asked by 世锬 on 2024-3-18
Answered: arushi on 2024-9-10, 5:39
The network and the loss function are defined as follows:
layers1 = [
sequenceInputLayer([4 1 2],"Name","betaIn")
convolution2dLayer([3 2],32,"Name","conv1_1","Padding",[1 1 1 1],"WeightL2Factor",0)
reluLayer("Name","relu1_1")
convolution2dLayer([3 1],64,"Name","conv1_2","Padding",[1 1 1 1],"WeightL2Factor",0)
reluLayer("Name","relu1_2")
maxPooling2dLayer([2 2],"Name","pool1")
convolution2dLayer([3 2],128,"Name","conv2_1","Padding",[1 1 1 1],"WeightL2Factor",0)
reluLayer("Name","relu2_1")
convolution2dLayer([2 2],128,"Name","conv2_2","Padding",[1 1 1 1],"WeightL2Factor",0)
reluLayer("Name","relu2_2")
maxPooling2dLayer([2 2],"Name","pool2")
convolution2dLayer([2 2],64,"Name","conv3_1","Padding",[1 1 1 1],"WeightL2Factor",0)
reluLayer("Name","relu3_1")
convolution2dLayer([3 3],32,"Name","conv3_2","Padding",[1 1 1 1],"WeightL2Factor",0)
reluLayer("Name","relu3_2")
convolution2dLayer([3 3],2,"Name","conv3_3","Padding",[1 1 1 1],"WeightL2Factor",0)
reluLayer("Name","F")];
layers2 = [
sequenceInputLayer([5 1 2],"Name","alpha")
alphaMultiplyF("ComplexMultiply")
];
net=dlnetwork(layers1);
net=addLayers(net,layers2);
net=connectLayers(net,"F","ComplexMultiply/F");
net=initialize(net);
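function [loss,gradients,state] = modelLoss(net,beta,alpha,T)
    % dlgradient only works on traced dlarrays, so this function is meant
    % to be evaluated through dlfeval, e.g. dlfeval(@modelLoss,net,beta,alpha,T).
    % Forward data through network.
    [Y,state] = forward(net,beta,alpha);
    % Calculate mean squared error loss.
    loss = mse(Y,T);
    % Calculate gradients of loss with respect to learnable parameters.
    gradients = dlgradient(loss,net.Learnables);
end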

Answers (1)

arushi 2024-9-10,5:39
When dlgradient returns zeros for every learnable, the loss is insensitive to the network parameters at the point where it was evaluated. Possible causes include the network architecture, the loss function, the data, or the way the gradients are computed. A few steps to debug the issue:
  • Inspect Learnables: Check net.Learnables to confirm that it contains the parameters you expect.
  • Test Custom Layer: If possible, isolate and test your custom layer (alphaMultiplyF) to confirm that its forward and backward passes are computed correctly.
  • Simplify the Model: Temporarily reduce the model to a minimal version that should still be able to learn (for example, remove some layers). This helps identify whether a specific part of the network causes the issue.
  • Check Outputs: Before calculating the loss, inspect the network output (Y) to confirm that the values are reasonable and not all zeros or NaNs.
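The output and gradient checks above can be sketched roughly as follows. This is only a minimal sketch on a small stand-in network, not the network from the question: the layers, the data sizes, and the names sketchLoss, maxAbsGrad, and fracZero are all assumptions made for illustration. The loss function returns Y as an extra output so that the output and the per-parameter gradient magnitudes can be inspected together.

```matlab
% Stand-in network (hypothetical; replace with your own dlnetwork).
layers = [
    featureInputLayer(3,"Name","in")
    fullyConnectedLayer(4,"Name","fc1")
    reluLayer("Name","relu")
    fullyConnectedLayer(1,"Name","fc2")];
net = dlnetwork(layers);

% Random stand-in data: 3 features, 8 observations, format "CB".
X = dlarray(rand(3,8,"single"),"CB");
T = dlarray(rand(1,8,"single"),"CB");

% dlgradient must run inside a function that is evaluated by dlfeval.
[loss,gradients,Y] = dlfeval(@sketchLoss,net,X,T);

% Largest absolute gradient for each row of the Learnables table.
maxAbsGrad = cellfun(@(g) max(abs(extractdata(g)),[],"all"),gradients.Value);
disp(table(gradients.Layer,gradients.Parameter,maxAbsGrad, ...
    'VariableNames',["Layer","Parameter","MaxAbsGrad"]))

% Fraction of network outputs that are exactly zero, and a NaN check.
fracZero = mean(extractdata(Y) == 0,"all");
hasNaN = any(isnan(extractdata(Y)),"all");
fprintf("Zero outputs: %.2f, NaN present: %d\n",fracZero,hasNaN)

function [loss,gradients,Y] = sketchLoss(net,X,T)
    % Forward pass, MSE loss, and gradients w.r.t. all learnables.
    Y = forward(net,X);
    loss = mse(Y,T);
    gradients = dlgradient(loss,net.Learnables);
end
```

If MaxAbsGrad is zero in every row and fracZero is 1, one thing worth checking is whether a final reluLayer (such as "F" in the question) receives only negative inputs, since a ReLU that outputs zero everywhere passes no gradient back to the layers before it. This is a possible cause to rule out, not a confirmed diagnosis of the network above.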
Hope it helps!

Release: R2023b