Curve fitting and parameter estimation with lsqcurvefit

Question

Mr. P 2024-5-10

0
链接

此问题的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/2117481-curve-fitting-and-parameter-estimation-with-lsqcurvefit

评论： Star Strider 2024-5-11

Dear Matlab family...

I have been following previous chats about this topic but when I try to relate with my model I get the error below. Note that, to improve the accuacy of the model, not everything needs to be estimated from the dataset, some aspects like natural mortality can be obtained from literature and government records, so in this case, I want to pass mu and Lambda as I already obtained them from literature, and only estimate the other parameters. I have attached my matlab script and my data for your convinience. Thank you very much for your assistance. As this is experimental, feel free to vary the initial conditions and boundaries for the parameters.

Error using lsqcurvefit>initEvalErrorHandler
FUN must have two input arguments.
Error in lsqcurvefit (line 258)
            userFcn_ME = initEvalErrorHandler(userFcn_ME,funfcn_x_xdata{3}, ...
Error in fit_run (line 24)
estimated_params = lsqcurvefit(objective, initial_guess, lb, ub);

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

请先登录，再进行评论。

请先登录，再回答此问题。

Answer 1

Torsten 2024-5-10

0
链接

此回答的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/2117481-curve-fitting-and-parameter-estimation-with-lsqcurvefit#answer_1455766

编辑：Torsten 2024-5-10

在 MATLAB Online 中打开

% Load data
data = readtable('infections.csv');
dates = datetime(data.Date, 'InputFormat', 'dd-MMM-yy');
cases = data.Cases;
% Define time vector
t = 1:numel(cases);
% Initial guess for parameters
initial_guess = [0.5, 0.1, 0.1];
% Define lower and upper bounds
lb = [0, 0, 0];
ub = [1, 1, 1]; 
% Call lsqcurvefit
estimated_params = lsqcurvefit(@objective, initial_guess, t, cases, lb, ub);
disp(estimated_params);
function I = objective(params,t)
  S0 = ...
  E0 = ...
  I0 = ...
  R0 = ...
  [~,Y] = ode45(@(t,y)fit(t,y,params),t,[S0;E0;I0;R0]);
  I = Y(:,3);
end
function dydt = fit(t, y, params)
  S = y(1);
  E = y(2);
  I = y(3);
  R = y(4);
    
  beta = params(1);
  sigma = params(2);
  gamma = params(3);
  Lambda=1000;
  mu=0.000065;
    
  dS = Lambda-beta*S*I-mu*S;
  dE = beta*S*I - sigma*E-mu*E;
  dI = sigma*E - gamma*I-mu*I;
  dR = gamma*I-mu*R;
  dydt = [dS; dE; dI; dR];
end

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

请先登录，再进行评论。

Answer 2

Star Strider 2024-5-10

0
链接

此回答的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/2117481-curve-fitting-and-parameter-estimation-with-lsqcurvefit#answer_1455791

在 MATLAB Online 中打开

infections.csv

If you look at your data, I do not believe that there is any way to use your ‘SEIR’ model to fit them —

T1 = readtable('infections.csv')

T1 = 1017x2 table

Date Cases ___________ _____ 01-Jan-0020 1 02-Jan-0020 2 03-Jan-0020 0 04-Jan-0020 0 05-Jan-0020 0 06-Jan-0020 0 07-Jan-0020 0 08-Jan-0020 2 09-Jan-0020 2 10-Jan-0020 0 11-Jan-0020 0 12-Jan-0020 1 13-Jan-0020 0 14-Jan-0020 1 15-Jan-0020 0 16-Jan-0020 0

figure

plot(T1{:,1}, T1{:,2})

grid

xlabel('Date')

ylabel('Infections')

That aside, I wrote code using both lsqcurvefit (thatt fails because I serously doube any parameter set will work with your code) andusing the genetic algorithm (ga) that you can use if you have the Global Optimization Toolbox. Thd data span three yeaars, from 20 to 22, so the date data on the x-axis are not accurate, although the plot itself is. I created a linear daily date vector ‘t’ in my code that you can use if you want to.

My code for your problem —

clear all

close all

T1 = readtable('infections.csv')

DOY = day(T1.Date,'dayofyear');

didx = [diff(DOY)<0];

de = DOY(didx~=0);

di = cumsum([0; didx]);

di(di==1) = DOY(di == 1) + de(1);

di(di==2) = DOY(di == 2) + de(1) + de(2);

% size(di)

% size(T1)

figure

plot(di, T1{:,2})

grid

% return

t = di;

c = T1{:,2};

% theta0 = rand(7,1).*[ones(3,1)*1E+5; rand(4,1)];

% tic

% % [theta,Rsdnrm,Rsd,ExFlg,OptmInfo,Lmda,Jmat]=lsqcurvefit(@kinetics,theta0,t,c,zeros(size(theta0)));

% toc

ftns = @(theta) norm(c-kinetics(theta,t));

PopSz = 500;

Parms = 7;

optsAns = optimoptions('ga', 'PopulationSize',PopSz, 'InitialPopulationMatrix',randi(1E+4,PopSz,Parms).*[ones(1,3) ones(1,4)*1E-3], 'MaxGenerations',5E3, 'FunctionTolerance',1E-10); % Options Structure For 'Answers' Problems

% opts = optimoptions('ga', 'PopulationSize',PopSz, 'InitialPopulationMatrix',randi(1E+4,PopSz,Parms)*1E-3, 'MaxGenerations',5E3, 'FunctionTolerance',1E-10, 'PlotFcn',@gaplotbestf, 'PlotInterval',1);

% optshf = optimoptions('ga', 'PopulationSize',PopSz, 'InitialPopulationMatrix',randi(1E+4,PopSz,Parms)*1E-3, 'MaxGenerations',5E3, 'FunctionTolerance',1E-10, 'HybridFcn',@fmincon, 'PlotFcn',@gaplotbestf, 'PlotInterval',1); % With 'HybridFcn'

t0 = clock;

fprintf('\nStart Time: %4d-%02d-%02d %02d:%02d:%07.4f\n', t0)

[theta,fval,exitflag,output,population,scores] = ga(ftns, Parms, [],[],[],[],zeros(Parms,1),Inf(Parms,1),[],[],optsAns);

t1 = clock;

fprintf('\nStop Time: %4d-%02d-%02d %02d:%02d:%07.4f\n', t1)

GA_Time = etime(t1,t0)

DT_GA_Time = datetime([0 0 0 0 0 GA_Time], 'Format','HH:mm:ss.SSS');

fprintf('\nElapsed Time: %23.15E\t\t%s\n\n', GA_Time, DT_GA_Time)

fprintf('Fitness value at convergence = %.4f\nGenerations \t\t\t\t = %d\n\n',fval,output.generations)

fprintf(1,'\tRate Constants:\n')

for k1 = 1:numel(theta)

fprintf(1, '\t\tTheta(%2d) = %8.5f\n', k1, theta(k1))

end

tv = linspace(min(t), max(t));

Cfit = kinetics(theta, tv);

figure

hd = plot(t, c, 'p');

for k1 = 1:size(c,2)

CV(k1,:) = hd(k1).Color;

hd(k1).MarkerFaceColor = CV(k1,:);

end

hold on

hlp = plot(tv, Cfit);

for k1 = 1:size(c,2)

hlp(k1).Color = CV(k1,:);

end

hold off

grid

xlabel('Time')

ylabel('Concentration')

legend(hlp, compose('C_%d',1:size(c,2)), 'Location','best')

function C=kinetics(params,t)

c0=params(end-3:end);

[T,Cv]=ode15s(@DifEq,t,c0);

function dydt = DifEq(t, y)

S = y(1);

E = y(2);

I = y(3);

R = y(4);

beta = params(1);

sigma = params(2);

gamma = params(3);

Lambda=1000;

mu=0.000065;

dS = Lambda-beta*S*I-mu*S;

dE = beta*S*I - sigma*E-mu*E;

dI = sigma*E - gamma*I-mu*I;

dR = gamma*I-mu*R;

dydt = [dS; dE; dI; dR];

end

C=Cv(:,3);

end

I assumed that the ‘Cases’ were likely described by ‘dI’ so I chose the third column of the output of the integration to fit them. I also chaged the integrator to ode15s because your data are likely ‘stiff’ as are systems of differential equations describing most kinetic data.

My code runs, however it doesn’t converge because your model likely doesn’t describe your data.

.

2 个评论
显示无隐藏无

Mr. P 2024-5-10

在 MATLAB Online 中打开

infections1.csv

@Star Strider Thank you for your response. I have adjusted my data so that it doesn't have the seasonality pattern, here is my adjusted data (infections1). However, when I run the code you provided above with this data, I get the error:

Index exceeds the number of array elements. Index must not exceed 0.
Error in fit_run (line 8)
di(di==1) = DOY(di == 1) + de(1);

Star Strider 2024-5-11

在 MATLAB Online 中打开

infections1.csv

I seriously doubt if my code (and your sifferential equation system) are going to work, regardless. Ths eimple reason is that they are not set up to model the sort of data you are giving them.

If you already have an identified model, you could probably (no promises) estimate the ‘S’, ‘E’, and ‘R’ data from the available ‘I’ data, however fitting those equations to your data is likely not possible, simply because your differential equations describe an entirely different situation than are present in your data. See for example The SEIRS model for infectious disease dynamics Nat Methods 17, 557–558 (2020). The newer data do not change that.

clear all

close all

T1 = readtable('infections1.csv')

T1 = 145x2 table

Date Cases ___________ _____ 09-Mar-0020 76 10-Mar-0020 17 11-Mar-0020 0 12-Mar-0020 25 13-Mar-0020 4 14-Mar-0020 25 15-Mar-0020 3 16-Mar-0020 16 17-Mar-0020 15 18-Mar-0020 28 19-Mar-0020 14 20-Mar-0020 3 21-Mar-0020 5 22-Mar-0020 27 23-Mar-0020 6 24-Mar-0020 12

DOY = day(T1.Date,'dayofyear');

% didx = [diff(DOY)<0];

% de = DOY(didx~=0)

% di = cumsum([0; didx])

% di(di==1) = DOY(di == 1) + de(1);

% di(di==2) = DOY(di == 2) + de(1) + de(2);

% size(di)

% size(T1)

di = DOY - DOY(1);

figure

plot(di, T1{:,2})

grid

xlabel('Date')

ylabel('Infections')

return

t = di;

c = T1{:,2};

% theta0 = rand(7,1).*[ones(3,1)*1E+5; rand(4,1)];

% tic

% % [theta,Rsdnrm,Rsd,ExFlg,OptmInfo,Lmda,Jmat]=lsqcurvefit(@kinetics,theta0,t,c,zeros(size(theta0)));

% toc

ftns = @(theta) norm(c-kinetics(theta,t));

PopSz = 500;

Parms = 7;

optsAns = optimoptions('ga', 'PopulationSize',PopSz, 'InitialPopulationMatrix',randi(1E+4,PopSz,Parms).*[ones(1,3) ones(1,4)*1E-3], 'MaxGenerations',5E3, 'FunctionTolerance',1E-10); % Options Structure For 'Answers' Problems

% opts = optimoptions('ga', 'PopulationSize',PopSz, 'InitialPopulationMatrix',randi(1E+4,PopSz,Parms)*1E-3, 'MaxGenerations',5E3, 'FunctionTolerance',1E-10, 'PlotFcn',@gaplotbestf, 'PlotInterval',1);

% optshf = optimoptions('ga', 'PopulationSize',PopSz, 'InitialPopulationMatrix',randi(1E+4,PopSz,Parms)*1E-3, 'MaxGenerations',5E3, 'FunctionTolerance',1E-10, 'HybridFcn',@fmincon, 'PlotFcn',@gaplotbestf, 'PlotInterval',1); % With 'HybridFcn'

t0 = clock;

fprintf('\nStart Time: %4d-%02d-%02d %02d:%02d:%07.4f\n', t0)

[theta,fval,exitflag,output,population,scores] = ga(ftns, Parms, [],[],[],[],zeros(Parms,1),Inf(Parms,1),[],[],optsAns);

t1 = clock;

fprintf('\nStop Time: %4d-%02d-%02d %02d:%02d:%07.4f\n', t1)

GA_Time = etime(t1,t0)

DT_GA_Time = datetime([0 0 0 0 0 GA_Time], 'Format','HH:mm:ss.SSS');

fprintf('\nElapsed Time: %23.15E\t\t%s\n\n', GA_Time, DT_GA_Time)

fprintf('Fitness value at convergence = %.4f\nGenerations \t\t\t\t = %d\n\n',fval,output.generations)

fprintf(1,'\tRate Constants:\n')

for k1 = 1:numel(theta)

fprintf(1, '\t\tTheta(%2d) = %8.5f\n', k1, theta(k1))

end

tv = linspace(min(t), max(t));

Cfit = kinetics(theta, tv);

figure

hd = plot(t, c, 'p');

for k1 = 1:size(c,2)

CV(k1,:) = hd(k1).Color;

hd(k1).MarkerFaceColor = CV(k1,:);

end

hold on

hlp = plot(tv, Cfit);

for k1 = 1:size(c,2)

hlp(k1).Color = CV(k1,:);

end

hold off

grid

xlabel('Time')

ylabel('Concentration')

legend(hlp, compose('C_%d',1:size(c,2)), 'Location','best')

function C=kinetics(params,t)

c0=params(end-3:end);

[T,Cv]=ode15s(@DifEq,t,c0);

function dydt = DifEq(t, y)

S = y(1);

E = y(2);

I = y(3);

R = y(4);

beta = params(1);

sigma = params(2);

gamma = params(3);

Lambda=1000;

mu=0.000065;

dS = Lambda-beta*S*I-mu*S;

dE = beta*S*I - sigma*E-mu*E;

dI = sigma*E - gamma*I-mu*I;

dR = gamma*I-mu*R;

dydt = [dS; dE; dI; dR];

end

C=Cv(:,3);

end

.

请先登录，再进行评论。

Curve fitting and parameter estimation with lsqcurvefit

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

回答（2 个）

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

2 个评论
显示无隐藏无

另请参阅

类别

标签

Community Treasure Hunt

Curve fitting and parameter estimation with lsqcurvefit

0 个评论 显示 -2更早的评论隐藏 -2更早的评论

回答（2 个）

0 个评论 显示 -2更早的评论隐藏 -2更早的评论

2 个评论 显示 无隐藏 无

另请参阅

类别

标签

Community Treasure Hunt

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

2 个评论
显示无隐藏无