Not able to calculate gradient of loss function in a neural network program

10 views (last 30 days)
Hi,
I am trying to solve a physics-informed neural network problem in which I constructed a loss function as follows:
function [loss,gradients] = loss_fun(parameters,x,C,alpha)
% C is a complex-valued constant
% alpha is a real-valued constant
NN = model(parameters,x); % Feedforward neural network
NNx = dlgradient(sum(NN,"all"),x,EnableHigherDerivatives=true); % First derivative of NN with respect to x
NNxx = dlgradient(sum(NNx,"all"),x,EnableHigherDerivatives=true); % Second derivative of NN with respect to x
f = C*NN; % Intermediate function
fxx = C*NNxx; % Second derivative of f with respect to x (C is a constant)
g = fxx+alpha*f; % Objective function
gr = real(g); % Real part of g
gi = imag(g); % Imaginary part of g
zeroTarget_r = zeros(size(gr),"like",gr); % Zero targets for the real part
loss_r = l2loss(gr,zeroTarget_r); % Real-part loss function
zeroTarget_i = zeros(size(gi),"like",gi); % Zero targets for the imaginary part
loss_i = l2loss(gi,zeroTarget_i); % Imaginary-part loss function
loss = loss_r+loss_i; % Total loss function (real-valued)
gradients = dlgradient(loss,parameters); % Gradients of the loss with respect to the parameters
end
The function 'model' returns a feedforward neural network. I would like to minimize the function g with respect to the parameters θ. The input variable x and the parameters θ of the neural network are real-valued. Here, fxx, the second derivative of f with respect to x, is calculated via nested dlgradient calls with EnableHigherDerivatives=true, as shown above. The presence of the complex-valued constant C makes the objective function g complex-valued. Hence, I split it into real and imaginary parts, calculated the individual loss functions, and added them.
While calculating the gradients, I encounter the following error:
"Encountered complex value when computing gradient with respect to an input to fullyconnect. Convert all inputs to fullyconnect to real".
I checked the individual loss values and the parameter values; they are purely real.
I would be grateful if you could suggest possible reasons for the error and steps to resolve it.
I am using fmincon with the 'lbfgs' Hessian approximation for the optimization.
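For context, I evaluate the loss function through dlfeval so that dlgradient can trace the computation. A minimal sketch of the call (the "CB" dlarray format here is illustrative):
x = dlarray(x,"CB"); % Collocation points as a traced dlarray (format assumed)
[loss,gradients] = dlfeval(@loss_fun,parameters,x,C,alpha); % Evaluate the loss and its gradients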
2 Comments
Richard 2023-5-17
I posted an answer regarding the complex value issue, but as an aside, you might be interested in the lbfgsupdate function, which was recently added to Deep Learning Toolbox in R2023a.
Dr. Veerababu Dharanalakota
Thank you, Richard. The addition of the lbfgsupdate function is a great help for researchers working on physics-informed neural networks.
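Following up on the lbfgsupdate mention above, a minimal training loop built around it might look like the sketch below (assuming R2023a or later; the iteration budget is illustrative):
lossFcn = @(parameters) dlfeval(@loss_fun,parameters,x,C,alpha); % Loss as a function of the parameters only
solverState = lbfgsState; % Initial L-BFGS solver state
for iteration = 1:1000 % Illustrative iteration budget
    [parameters,solverState] = lbfgsupdate(parameters,lossFcn,solverState); % One L-BFGS update step
end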


Accepted Answer

Richard 2023-5-17
I think this may be due to your introduction of the complex value into the output of the model, NN. Even though you later split this into two real halves, the backward gradient computation will step back through the (complex) C * (real) NN operation, which reintroduces a complex gradient during the backward pass.
Try calling NN = real(NN) before this step to insulate the real-valued model from the complex part of the calculation:
NN = model(parameters,x); % Feedforward neural network
NN = real(NN);
f = C*NN; % Intermediate function
It may seem counter-intuitive to apply this before the complex values are created, and indeed in the forward computation it has no effect because NN is already real. But in the backward pass for gradients the computation flows in the opposite direction through the code, so the backward of real(NN) runs after the backward of C*NN. It discards the imaginary parts of the gradient, which at this point have no meaning because the NN value has no imaginary part.
4 Comments
Jast 2024-1-4
This answer is really appreciated. Thanks so much! It helped me during a late-night debugging session!!!


More Answers (1)

Kartik 2023-5-17
Hi,
The error message suggests that there is a complex value in the input to the fully connected layer of your neural network model. This is likely because the intermediate function "f" is the complex constant "C" multiplied by the neural network output "NN": if "C" is complex, then "f" is complex-valued as well, and subsequent computations involving "f" may introduce complex values.
To resolve this error and perform backpropagation through your neural network, you need to ensure that all inputs to the network are real-valued. One way to do this is to separate the real and imaginary parts of the complex input to the fully connected layer and pass them as separate inputs, using the "real" and "imag" functions:
NN = model(parameters,x); % Feedforward neural network
f = C*NN; % Intermediate function
f_real = real(f); % Real part of f
f_imag = imag(f); % Imaginary part of f
fc_in = [f_real; f_imag]; % Concatenate f_real and f_imag
fc_out = fullyconnect(fc_in,weights_fc,bias_fc); % Fully connected layer output (weights_fc and bias_fc are learnable dlarray parameters defined elsewhere)
Here, the "fc_in" matrix is formed by concatenating the real and imaginary parts of "f", and then passed to the fully connected layer.
Refer to the MathWorks documentation for fullyconnect for more information.
3 Comments
Kartik 2023-5-18
Yes, that can be a possible workaround, if we separate the real and imaginary parts and perform all the other calculations, such as loss computation and gradient descent, on them separately.
Dr. Veerababu Dharanalakota
Okay. Splitting the real and imaginary parts results in several loss functions that must be optimized simultaneously, which may pose a problem during training. But I will give it a try and get back to you.
