Physics-informed NN for parameter identification
Dear all,
I am trying to use a physics-informed neural network (PINN) for inverse parameter identification for an ODE or PDE.
I based my code on the example in this answer: https://ww2.mathworks.cn/matlabcentral/answers/2019216-physical-informed-neural-network-identify-coefficient-of-loss-function#answer_1312867
Here's the program I wrote:
clear; clc;

% Specify training configuration
numEpochs = 500000;
avgG = [];
avgSqG = [];
batchSize = 500;
lossFcn = @modelLoss;
lr = 1e-5;

% Inverse PINN for d2x/dt2 = mu1*x + mu2*x^2
mu1Actual = -rand;
mu2Actual = rand;
x = @(t) cos(sqrt(-mu1Actual)*t) + sin(sqrt(-mu2Actual)*t);
maxT = 2*pi/sqrt(max(-mu1Actual, -mu2Actual));
t = dlarray(linspace(0, maxT, batchSize), "CB");
xactual = dlarray(x(t), "CB");

% Specify a network and initial guesses for mu1 and mu2
net = [
    featureInputLayer(1)
    fullyConnectedLayer(100)
    tanhLayer
    fullyConnectedLayer(100)
    tanhLayer
    fullyConnectedLayer(1)];
params.net = dlnetwork(net);
params.mu1 = dlarray(-0.5);
params.mu2 = dlarray(0.5);

% Train
for i = 1:numEpochs
    [loss, grad] = dlfeval(lossFcn, t, xactual, params);
    [params, avgG, avgSqG] = adamupdate(params, grad, avgG, avgSqG, i, lr);
    if mod(i, 1000) == 0
        fprintf("Epoch: %d, Predicted mu1: %.3f, Actual mu1: %.3f, Predicted mu2: %.3f, Actual mu2: %.3f\n", ...
            i, extractdata(params.mu1), mu1Actual, extractdata(params.mu2), mu2Actual);
    end
end

function [loss, grad] = modelLoss(t, x, params)
    xpred = forward(params.net, t);
    dxdt = dlgradient(sum(real(xpred)), t, 'EnableHigherDerivatives', true);
    d2xdt2 = dlgradient(sum(dxdt), t);
    % Modify the ODE residual based on your specific ODE
    odeResidual = d2xdt2 - (params.mu1 * xpred + params.mu2 * xpred.^2);
    % Compute the mean square error of the ODE residual
    odeLoss = mean(odeResidual.^2);
    % Compute the L2 difference between the predicted xpred and the true x
    dataLoss = l2loss(real(x), real(xpred)); % ensure the real part is used
    % Sum the losses and take gradients
    loss = odeLoss + dataLoss;
    [grad.net, grad.mu1, grad.mu2] = dlgradient(loss, params.net.Learnables, params.mu1, params.mu2);
end
When I run the script, no errors are reported, but the two learned parameters do not get closer to the true values as the number of iterations increases.
I would like to understand the reason for this behavior and how to fix it. If you can help me correct the code, I would be very grateful!
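For reference, a quick check of the generated data (illustrative, not part of the script above) shows that the training target xactual is complex-valued, since sqrt(-mu2Actual) is imaginary whenever mu2Actual > 0:
% Quick check (illustrative): the target data is complex because
% sqrt(-mu2Actual) is imaginary when mu2Actual > 0.
mu1Actual = -rand;
mu2Actual = rand;
x = @(t) cos(sqrt(-mu1Actual)*t) + sin(sqrt(-mu2Actual)*t);
tSample = linspace(0, 2*pi, 10);
isreal(x(tSample))   % returns 0 (false): sin of an imaginary argument is complex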
3 Comments
carlos Hernando
2024-5-19
Edited: carlos Hernando
2024-5-20
Hello @Ben, I tried similar code with lbfgsupdate, but I am unable to make it work. Any suggestions? Thank you for your time.
% Sample X in (0.001, 1.001) x (0.001, 1.001). The 0.001 is to avoid t = 0.
X = dlarray(0.001 + rand(dimension, batchSize), "CB");
uactual = solutionFcn(X);

% lossFcn = dlaccelerate(@modelLoss);
lossFcn = @(parameters) dlfeval(@modelLoss, X, parameters, uactual);
solverState = lbfgsState;
for iter = 1:maxIters
    % [loss, gradients] = dlfeval(lossFcn, X, parameters, uactual);
    % fprintf("Iteration: %d, Loss: %.4f \n", iter, extractdata(loss));
    [parameters, solverState] = lbfgsupdate(parameters, lossFcn, solverState);
    % [parameters, avgG, avgSqG] = adamupdate(parameters, gradients, avgG, avgSqG, iter, 1e-3);
    if mod(iter, 1000) == 0
        fprintf("Iteration: %d, Predicted c: %.3f, Actual c: %.3f \n", ...
            iter, extractdata(parameters.c), trueC);
    end
end
Edit: adapted the code to match the example above.
Ben
2024-5-20
Could you post your modelLoss and solutionFcn, and show how parameters is created? These seem to differ from the example above.
The example code I posted before can be adapted to use lbfgsupdate as shown below. However, it seems you have to be careful with the L-BFGS options - see the lbfgsupdate and lbfgsState documentation for all of them - for example, state.LineSearchStatus would often be "failed" in some experiments I tried. In such cases it can help to first train with adamupdate until you have reasonable convergence, then continue training with lbfgsupdate from that point; a sketch of that warm start follows the script below.
clear; clc;

% Specify training configuration
numEpochs = 1000;
avgG = [];
avgSqG = [];
batchSize = 500;
lossFcn = dlaccelerate(@modelLoss); % accelerate the loss function
lr = 1e-3; % NOTE: I increased this, but 1e-5 may be fine too.

% Inverse PINN for d2u/dt2 = mu1*u, d2v/dt2 = mu2*v, x = u + iv
mu1Actual = -rand;
mu2Actual = rand;
x = @(t) cos(sqrt(-mu1Actual)*t) + sin(sqrt(-mu2Actual)*t);
maxT = 2*pi/sqrt(max(-mu1Actual, -mu2Actual));
t = dlarray(linspace(0, maxT, batchSize), "CB");
xactual = dlarray(x(t), "CB");

% Specify a network and initial guesses for mu1 and mu2
net = [
    featureInputLayer(1)
    fullyConnectedLayer(100)
    tanhLayer
    fullyConnectedLayer(100)
    tanhLayer
    fullyConnectedLayer(2)]; % make the network have two outputs, u(t) and v(t)
params.net = dlnetwork(net);
params.mu1 = dlarray(-0.5);
params.mu2 = dlarray(0.5);

fcn = @(params) dlfeval(@modelLoss, t, xactual, params);
fcn = dlaccelerate(fcn);
state = lbfgsState;

% Train
for i = 1:numEpochs
    [params, state] = lbfgsupdate(params, fcn, state, ...
        "MaxNumLineSearchIterations", 50, "LineSearchMethod", "backtracking");
    if mod(i, 10) == 0
        fprintf("Epoch: %d, Predicted mu1: %.3f, Actual mu1: %.3f, Predicted mu2: %.3f, Actual mu2: %.3f\n", ...
            i, extractdata(params.mu1), mu1Actual, extractdata(params.mu2), mu2Actual);
    end
end

function [loss, grad] = modelLoss(t, x, params)
    xpred = forward(params.net, t);
    xpred = real(xpred); % this real call prevents complex gradients propagating backward into params.net
    u = xpred(1,:);
    v = xpred(2,:);
    dudt = dlgradient(sum(u), t, EnableHigherDerivatives=true);
    d2udt2 = dlgradient(sum(dudt), t);
    dvdt = dlgradient(sum(v), t, EnableHigherDerivatives=true);
    d2vdt2 = dlgradient(sum(dvdt), t);
    % Modify the ODE residuals based on your specific ODE
    odeResidual = (d2udt2 - params.mu1 .* u).^2;
    odeResidual = odeResidual + (d2vdt2 - params.mu2 .* v).^2;
    % Compute the mean square error of the ODE residuals
    odeLoss = mean(odeResidual, "all");
    % Compute the L2 difference between the prediction and the true x,
    % matching the real and imaginary parts separately
    dataLoss = l2loss(real(x), u) + l2loss(imag(x), v);
    % Sum the losses and take gradients
    loss = odeLoss + dataLoss;
    [grad.net, grad.mu1, grad.mu2] = dlgradient(loss, params.net.Learnables, params.mu1, params.mu2);
end
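As a rough sketch of the Adam-then-LBFGS warm start mentioned above (assuming the same t, xactual, params, and modelLoss as in the script; the epoch counts and the LineSearchStatus check are illustrative, not tuned values):
% Sketch: train with adamupdate first, then continue with lbfgsupdate.
% Assumes t, xactual, params, and modelLoss from the script above.
numAdamEpochs = 2000; % illustrative
avgG = []; avgSqG = [];
for i = 1:numAdamEpochs
    [loss, grad] = dlfeval(@modelLoss, t, xactual, params);
    [params, avgG, avgSqG] = adamupdate(params, grad, avgG, avgSqG, i, 1e-3);
end

% Continue from the Adam solution with L-BFGS, monitoring the line search.
fcn = @(p) dlfeval(@modelLoss, t, xactual, p);
state = lbfgsState;
for i = 1:500
    [params, state] = lbfgsupdate(params, fcn, state, "LineSearchMethod", "backtracking");
    if state.LineSearchStatus == "failed"
        fprintf("Line search failed at iteration %d; stopping early.\n", i);
        break
    end
end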
Answers (0)