Parameter estimation for a system of differential equations with multiple time spans

Jakob24 2021-3-19

https://ww2.mathworks.cn/matlabcentral/answers/777627-parameter-estimation-for-a-system-of-differential-equations-with-multiple-time-spans

Commented: Jakob24 2021-4-19

Accepted Answer: Star Strider

Hello,

I've got four sets of experimental data with variables c1(t1), c2(t2), c3(t3) and c4(t4), each with their own time scale that is not common for all.

I'm interested in estimating two parameters, x1 and x2, of a model so that the parameters are the best fit for the four data sets, i.e. I want to fit one x1 and one x2 for all data-sets together, not one set of x per data set.

I've successfully managed to apply this to all four data sets (i.e. 4 ODEs) that are called in lsqcurvefit.

However, I can only manage to do this with one time vector with length equal to the longest C-vector.

This is not what I want. Since the experimental data depend on four different time scales, lsqcurvefit fits one of the data sets well while the remaining three come out wrong.

So, is it possible to solve for and fit based on each time series?

% What I've got currently:
c = [c1, c2, c3, c4]
t = t1
% but I want:
c = [c1,c2,c3,c4]
t = [t1 t2 t3 t4]
% which is then called in:
x = lsqcurvefit(@one_ODE,x0,t,c)
% where ODE is a call to a function that solves one ODE per concentration
% (4 in total), that has another function nested in it, e.g.:
function C = one_ODE(x,t) 
c0 = [54 0 27 0];
[T,Cv] = ode45(@DifEq, t, c0);
function dC = DifEq(t,c) 
dcdt = zeros(4,1);
dcdt(1) = - ((x(1).*c(1)) + (x(2).*c(1).*c(2))); 
dcdt(2) =   ((x(1).*c(1)) + (x(2).*c(1).*c(2))); 
dcdt(3) = - ((x(1).*c(3)) + (x(2).*c(3).*c(4))); 
dcdt(4) =   ((x(1).*c(3)) + (x(2).*c(3).*c(4)));
dC = dcdt;
end
    C = Cv; % 
end

Thanks in advance.

Kind regards,

Jakob

Accepted Answer

Star Strider 2021-3-19

https://ww2.mathworks.cn/matlabcentral/answers/777627-parameter-estimation-for-a-system-of-differential-equations-with-multiple-time-spans#answer_652587

This is definitely a new problem!

The usual way of dealing with it would be to vertically concatenate the time vectors and objective function matrices, and present all of them to lsqcurvefit at once, since lsqcurvefit ‘doesn’t care’ about the order of its inputs so long as they match.

I would be tempted to try something like this:

data{1} = readmatrix('File1.ext');
%               . . .
data{4} = readmatrix('File4.ext');
% Untested sketch: x still has to be passed into cat_ODE
function Cmat = cat_ODE(data)
for k = 1:4
    t = data{k}(:,1);
    Ccat{k,:} = one_ODE(x,t);
end
Cmat = cell2mat(Ccat);
end
for k = 1:4
    tcat{k,:} = data{k}(:,1);      % collect each time vector in a cell
    ycat{k,:} = data{k}(:,2:5);    % collect the matching data columns
end
tv = cell2mat(tcat);
ydata = cell2mat(ycat);

lsqcurvefit(@cat_ODE, x0, tv, ydata)

This assumes that the time vector is in the first column of each data set, and that the remaining data are in the last four columns.

That is the only approach I can see to this problem. It would obviously be necessary to experiment with it.
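For readers who want to try the stacking idea in isolation, here is a minimal sketch on synthetic data. It assumes a first-order decay with a known analytic solution in place of the ODE system, and the names kTrue, edges and model are invented for the example. Only lsqcurvefit and the vertical stacking of the time and data vectors come from the thread.

```matlab
% Two synthetic data sets share one decay rate k but have different
% initial values and different time vectors (all values are assumed).
kTrue = 0.3;
t1 = (0:1:10)';                  % 11 samples
t2 = (0:0.5:4)';                 % 9 samples
c0 = [54 27];                    % one initial concentration per data set
y1 = c0(1)*exp(-kTrue*t1);
y2 = c0(2)*exp(-kTrue*t2);

tv      = [t1; t2];              % stacked time vector, passed as xdata
ydata   = [y1; y2];              % stacked observations, same row order as tv
datalen = [numel(t1) numel(t2)];
edges   = [0 cumsum(datalen)];   % block k occupies rows edges(k)+1 : edges(k+1)

% The model evaluates each block with its own slice of the stacked times.
model = @(k, t) [c0(1)*exp(-k*t(edges(1)+1:edges(2)));
                 c0(2)*exp(-k*t(edges(2)+1:edges(3)))];

opts = optimoptions('lsqcurvefit', 'Display', 'off');
kFit = lsqcurvefit(model, 0.1, tv, ydata, [], [], opts);
fprintf('fitted k = %.4f (true value %.4f)\n', kFit, kTrue);
```

The point of the sketch is that lsqcurvefit only sees one long xdata vector and one long ydata vector; which rows belong to which experiment is decided entirely inside the model function.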

Jakob24 2021-3-19

Edited: Jakob24 2021-3-19

@Star Strider, thank you for the above.

I've got a few questions. Why are we calling cat_ODE in lsqcurvefit? Should it not be "one_ODE", as specified in the function file?

I'll post my code and a small Excel sheet with random data so you can see what I've tried. I've used only 2 data sets here and will adapt it to 4 later.

%% Function file:
function C = one_ODE(x,t)
[T,Cv] = ode45(@DifEq, t, c0);
function dC = DifEq(t,c) 
dcdt = zeros(4,1);
dcdt(1) = - ((x(1).*c(1)) + (x(2).*c(1).*c(2))); 
dcdt(2) =   ((x(1).*c(1)) + (x(2).*c(1).*c(2))); 
dcdt(3) = - ((x(1).*c(3)) + (x(2).*c(3).*c(4))); 
dcdt(4) =   ((x(1).*c(3)) + (x(2).*c(3).*c(4)));
dC = dcdt;
end
    C = Cv(:,[2,4]);
end
% Separate script file:
clear; clc; close all;
data{1} = readmatrix('File1');
data{2} = readmatrix('File2');
for k = 1:2
    tcat = data{k}(:,1);
    ycat{k,:} = data{k}(:,1:2);
end
t = cell2mat(tcat);
c = cell2mat(ycat);
    
x0 = [1e-4 1e-2]; 
x = lsqcurvefit(@cat_ODE, x0, t, c)
function Cmat = cat_ODE(data)
for k = 1:2
    t = data{k}(:,1);
    Ccat{k,:} = one_ODE(x,t);
end
Cmat = cell2mat(Ccat);
end
lsqcurvefit(@cat_ODE, x0, t, c) % Why are we calling cat_ODE?

edit: sorted out typo

Now with error: "Brace indexing is not supported for variables of this type."

cellclass = class(c{1});

Error in one_script (line 11)

t = cell2mat(tcat);

Kind regards,

Jakob

Star Strider 2021-3-19

This runs without error (except for occasional problems with the initial random parameter estimates); however, the fit is definitely not good. The problem may lie in the differential equations themselves, so they may need revision.

The Code —

data{1} = readmatrix('File1.xlsx');
data{2} = readmatrix('File2.xlsx');
datalen = cellfun(@(x)size(x,1),data);
function C = one_ODE(x,t,k) 
c0 = [54 0 27 0];
[T,Cv] = ode45(@DifEq, t, c0);
function dC = DifEq(t,c) 
dcdt = zeros(4,1);
dcdt(1) = - ((x(1).*c(1)) + (x(2).*c(1).*c(2))); 
dcdt(2) =   ((x(1).*c(1)) + (x(2).*c(1).*c(2))); 
dcdt(3) = - ((x(1).*c(3)) + (x(2).*c(3).*c(4))); 
dcdt(4) =   ((x(1).*c(3)) + (x(2).*c(3).*c(4)));
dC = dcdt;
end
    C = Cv(:,2*k); % 
end
function Cmat = cat_ODE(x,data,datalen)
for k = 1:numel(datalen)
    idx = 1:datalen(k);
    if k>1
        idx = datalen(k-1)+(1:datalen(k));
    end
    t = data(idx,1);
    Ccat{k,:} = one_ODE(x,t,k);
end
Cmat = cell2mat(Ccat);
end
for k = 1:numel(data)
    tcat{k,:} = data{k}(:,1);
    ycat{k,:} = data{k}(:,2);
end
tv = cell2mat(tcat);
ydata = cell2mat(ycat);
x0 = rand(2,1)*10;
B = lsqcurvefit(@(x,data)cat_ODE(x,data,datalen), x0, tv, ydata)
figure
hold on
for k = 1:numel(datalen)
    plot(data{k}(:,1), data{k}(:,2),'.')
    plot(data{k}(:,1), one_ODE(B,data{k}(:,1),k), '-')
end
hold off
grid

It vertically concatenates the original data vectors, then uses ‘datalen’ to select the rows to be fitted in each iteration. (It also uses the size of ‘datalen’ itself to determine ‘k’.)

Since ‘k’ iterates from 1 to 2, this assignment in ‘one_ODE’:

C = Cv(:,2*k); %

determines the columns to be returned to lsqcurvefit in each iteration.

The code appears to work as designed, however the differential equations do not appear to be able to fit the data.
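To make the column selection concrete, here is a small sketch that integrates the four-state system once and picks out the column that each data set would be compared against. The parameter values in x and the time grid are assumptions chosen only so that the script runs; they are not fitted values.

```matlab
x  = [2e-4 8e-3];                 % assumed parameter values
c0 = [54 0 27 0];                 % initial values used in the thread
rhs = @(t, c) [-(x(1)*c(1) + x(2)*c(1)*c(2));
                (x(1)*c(1) + x(2)*c(1)*c(2));
               -(x(1)*c(3) + x(2)*c(3)*c(4));
                (x(1)*c(3) + x(2)*c(3)*c(4))];

t = linspace(0, 10, 25)';         % assumed output times
[~, Cv] = ode45(rhs, t, c0);      % Cv has one row per time, one column per state

selected = cell(2, 1);
for k = 1:2
    selected{k} = Cv(:, 2*k);     % k = 1 -> column 2, k = 2 -> column 4
end
fprintf('Cv is %d-by-%d\n', size(Cv, 1), size(Cv, 2));
```

Columns 2 and 4 are the product states of the two reactant/product pairs, so 2*k returns the product curve that belongs to data set k. Each pair also conserves its total, which gives a quick sanity check on the integration.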

Star Strider 2021-3-19

Edited: Star Strider 2021-3-19

I only got positive values and positive parameters in an otherwise unconstrained estimation. Part of the problem is that the two data sets are significantly different, so attempting to use one set of parameters to fit two obviously different data sets is likely not going to be optimal.

I suspect the parameter estimates from fitting the two data sets individually would be significantly different as well, and that one set of parameters would not work for both data sets. It might be worthwhile to consider using one of the Global Optimization Toolbox functions to search the entire parameter space for the best set of parameters that could fit both data sets; however, even that may be overly optimistic. Since the initial parameter estimates could be the problem, what should the parameter ranges be, considering that you’ve fitted the two sets individually? Nonlinear parameter estimation routines and algorithms are extremely sensitive to the initial estimates, so appropriate initial estimates are important.

That aside, my code appears to do what was requested of it.

EDIT — (19 Mar 2021 at 18:16)

I was actually able to get a decent parameter fit with these parameters:

B =
   201.1526e-006
     7.8527e-003

producing this plot —

So I am now satisfied that my code works correctly and that it does what it is supposed to do!

Star Strider 2021-3-19

As always, my pleasure!

Adapting it to a number of data sets would likely require only one change in ‘cat_ODE’, specifically with respect to the ‘idx’ variable:

function Cmat = cat_ODE(x,data,datalen)
for k = 1:numel(datalen)
    idx = 1:datalen(k);
    if k>1
        idx = sum(datalen(1:k-1))+(1:datalen(k));
    end
    t = data(idx,1);
    Ccat{k,:} = one_ODE(x,t,k);
end
Cmat = cell2mat(Ccat);
end

That works with the present data set. I cannot test it on larger data sets. If the data are stored as a cell array (as I did here), the rest should work. That part of the code is designed so that the index range for each part of the data set addresses the correct rows in the concatenated independent and dependent data vectors.
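A quick way to check the index arithmetic is to run it on made-up block lengths. The sketch below assumes three data sets with 3, 5 and 4 rows; offsets and idxAll are names introduced for the example. It also shows why using only the previous block's length as the offset stops working once there are more than two blocks.

```matlab
datalen = [3 5 4];                          % assumed row counts of three data sets
offsets = [0 cumsum(datalen(1:end-1))];     % rows that precede each block
idxAll  = cell(numel(datalen), 1);
for k = 1:numel(datalen)
    idxAll{k} = offsets(k) + (1:datalen(k));    % same as sum(datalen(1:k-1)) + (1:datalen(k))
    fprintf('set %d: rows %d to %d\n', k, idxAll{k}(1), idxAll{k}(end));
end

% With only the previous block's length as the offset, the third block
% starts at row 6 and overlaps the second block (rows 4 to 8).
idxTwoSet = datalen(2) + (1:datalen(3));
```

Precomputing the offsets with cumsum is equivalent to calling sum inside the loop; it is just a matter of taste.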

It really is as simple as using linspace to create a new time vector. This creates a new time vector (here ‘tvct’) for each iteration of the plotting loop:

figure
hold on
for k = 1:numel(datalen)
    plot(data{k}(:,1), data{k}(:,2),'.')
    tvct = linspace(min(data{k}(:,1)), max(data{k}(:,1)), 150);
    plot(tvct, one_ODE(B,tvct,k), '-')
end
hold off
grid

producing:

This one has 150 elements; change that number to get more or fewer points.

Jakob24 2021-3-19

Attaching "File3" as an additional input and pasting my updated code below:

clear; clc; close all
% Vertically concatenate original data:
data{1} = readmatrix('File1.xlsx'); 
data{2} = readmatrix('File2.xlsx'); 
data{3} = readmatrix('File3.xlsx'); 
datalen = cellfun(@(x)size(x,1),data); % uses datalen to select rows to be fitted in each iteration and to determine the size of "k"
for k = 1:numel(data) %
    tcat{k,:} = data{k}(:,1); 
    ycat{k,:} = data{k}(:,2);
end
tv = cell2mat(tcat); 
ydata = cell2mat(ycat); 
x0 = [1e-4, 1e-2]; % initial value guesses
B = lsqcurvefit(@(x,data)cat_ODE(x,data,datalen), x0, tv, ydata) 
figure
hold on
for k = 1:numel(datalen) 
    plot(data{k}(:,1), data{k}(:,2),data{k}(:,3)'.')
    tvct = linspace(min(data{k}(:,1)), max(data{k}(:,1)), 150);
    plot(tvct, one_ODE(B,tvct,k), '-')
end
hold off
grid
function Cmat = cat_ODE(x,data,datalen)
for k = 1:numel(datalen)
    idx = 1:datalen(k);
    if k>1
        idx = datalen(1:k-1)+(1:datalen(k));
    end
    t = data(idx,1);
    Ccat{k,:} = one_ODE(x,t,k);
end
Cmat = cell2mat(Ccat);
end
function C = one_ODE(x,t,k) 
c0 = [54 0 27 0 13.5 0];
[T,Cv] = ode45(@DifEq, t, c0);
function dC = DifEq(t,c) 
dcdt = zeros(6,1);
dcdt(1) = - ((x(1).*c(1)) + (x(2).*c(1).*c(2))); 
dcdt(2) =   ((x(1).*c(1)) + (x(2).*c(1).*c(2))); 
dcdt(3) = - ((x(1).*c(3)) + (x(2).*c(3).*c(4))); 
dcdt(4) =   ((x(1).*c(3)) + (x(2).*c(3).*c(4)));
dcdt(5) = - ((x(1).*c(5)) + (x(2).*c(5).*c(6))); 
dcdt(6) =   ((x(1).*c(5)) + (x(2).*c(5).*c(6)));
dC = dcdt;
end
    C = Cv(:,3*k); % 
end

Star Strider 2021-3-19

The error the code throws is not related to the dimensions but to the differential equations. Specifically, the warning that ode45 throws is:

Warning: Failure at t=4.257197e+00.  Unable to meet integration tolerances without
reducing the step size below the smallest value allowed (1.421085e-14) at time t. 

and as the result, lsqcurvefit throws:

Arrays have incompatible sizes for this operation.

The code works; however, ode45 encounters a singularity in the differential equations and stops early, so it returns fewer rows than there are requested time points, and that creates the dimension incompatibility.
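The row-count mismatch is easy to reproduce outside the fitting problem. The sketch below uses an assumed test equation, y' = y^2 with y(0) = 1, whose exact solution blows up at t = 1, so ode45 is expected to warn and return only the rows it managed to compute. The padding step at the end is one possible guard and is not something the thread uses; whether it is appropriate depends on the problem.

```matlab
t   = linspace(0, 2, 21)';          % requested output times
rhs = @(t, y) y.^2;                 % y' = y^2, y(0) = 1 blows up at t = 1
[T, Y] = ode45(rhs, t, 1);          % expect an integration-tolerance warning
nReceived = size(Y, 1);
fprintf('requested %d rows, received %d\n', numel(t), nReceived);

% Possible guard (an assumption, not from the thread): pad the result with
% the last computed value so it has one row per requested time point.
if nReceived < numel(t)
    Y(end+1:numel(t), :) = repmat(Y(end, :), numel(t) - nReceived, 1);
end
```

Padding only keeps the array sizes consistent so the optimizer can move on to other parameter values. Starting from smaller initial estimates, as done in this thread, or bounding the parameters may avoid the failed integration in the first place.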

My current code:

data{1} = readmatrix('File1.xlsx');
data{2} = readmatrix('File2.xlsx');
data{3} = readmatrix('File3.xlsx');
datalen = cellfun(@(x)size(x,1),data);
function C = one_ODE(x,t,k) 
c0 = [54 0 27 0 13.5 0];
[T,Cv] = ode45(@DifEq, t, c0);
function dC = DifEq(t,c) 
dcdt = zeros(6,1);
dcdt(1) = - ((x(1).*c(1)) + (x(2).*c(1).*c(2))); 
dcdt(2) =   ((x(1).*c(1)) + (x(2).*c(1).*c(2))); 
dcdt(3) = - ((x(1).*c(3)) + (x(2).*c(3).*c(4))); 
dcdt(4) =   ((x(1).*c(3)) + (x(2).*c(3).*c(4)));
dcdt(5) = - ((x(1).*c(5)) + (x(2).*c(5).*c(6))); 
dcdt(6) =   ((x(1).*c(5)) + (x(2).*c(5).*c(6)));
dC = dcdt;
end
    C = Cv(:,2*k); % 
end
function Cmat = cat_ODE(x,data,datalen)
for k = 1:numel(datalen)
    idx = 1:datalen(k);
    if k>1
        idx = sum(datalen(1:k-1))+(1:datalen(k));
    end
    t = data(idx,1);
    Ccat{k,:} = one_ODE(x,t,k);
end
Cmat = cell2mat(Ccat);
end
for k = 1:numel(data)
    tcat{k,:} = data{k}(:,1);
    ycat{k,:} = data{k}(:,2);
end
tv = cell2mat(tcat);
ydata = cell2mat(ycat);
x0 = rand(2,1)*0.000001;
B = lsqcurvefit(@(x,data)cat_ODE(x,data,datalen), x0, tv, ydata)
figure
hold on
for k = 1:numel(datalen)
    plot(data{k}(:,1), data{k}(:,2),'.')
    tvct = linspace(min(data{k}(:,1)), max(data{k}(:,1)), 150);
    plot(tvct, one_ODE(B,tvct,k), '-')
end
hold off
grid

However, reducing ‘x0’ further:

x0 = rand(2,1)*0.000001;

results in:

B =
   168.7523e-006
     8.2923e-003

and this plot —

And my code proves to be robust to additional data sets!

Star Strider 2021-3-20

‘I'm guessing it has to do with fitting several datasets that differ by several orders of magnitude’

Correct. Using ode15s is a good choice (ode23s would also work) since the parameters now vary by several orders-of-magnitude. (This is where the Global Optimization Toolbox functions would be appropriate if additional data sets make the fitting less than straightforward.)
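As a rough illustration of why a stiff solver can help, the sketch below integrates an assumed stiff test equation (not the system from this thread) with both ode45 and ode15s and compares the number of successful steps reported in the solution structure.

```matlab
rhs = @(t, y) -1000*(y - cos(t));   % assumed stiff test equation
sol45  = ode45(rhs,  [0 10], 0);    % explicit solver, step size limited by stability
sol15s = ode15s(rhs, [0 10], 0);    % implicit, variable-order solver for stiff problems
fprintf('ode45 steps: %d, ode15s steps: %d\n', ...
        sol45.stats.nsteps, sol15s.stats.nsteps);
```

Because the solvers share the same calling syntax, swapping ode45 for ode15s inside the ODE function only requires changing the function name.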

‘Is it possible to insert a function handle here to omit the legend entries of the raw data?’

Yes! Although not a function handle, instead a handle to the line object.

Change the figure loop to:

figure
hold on
for k = 1:numel(datalen)
    plot(data{k}(:,1), data{k}(:,2),'.')
    tvct = linspace(min(data{k}(:,1)), max(data{k}(:,1)), 150);
    h{k} = plot(tvct, one_ODE(B,tvct,k), '-');
end
hold off
grid
legend([h{:}], compose('%d',1:numel(h)), 'Location','NW')

It now also adapts to any number of data sets, and uses the number of elements in the ‘h’ cell array (instead of ‘k’) to create the legend entries. The compose function works like sprintf and the others, so create any string or character array you want with it to display in the legend object.
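A short sketch of how compose builds the legend labels; the concentration values in the second call are assumptions used only for illustration.

```matlab
labels  = compose('%d', 1:3);                 % one label per data set index
labels2 = compose('%g mg/ml', [10 6.75]);     % labels that carry a unit
disp(labels)
disp(labels2)
```

With a character-vector format, compose returns a cell array of character vectors with one element per input value, which is the form legend accepts.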

Jakob24 2021-4-15

@Star Strider

Hello Star Strider,

Working on another model with the very neat code you built.

However, I've encountered a strange issue. If you run the pasted code with the attached data sets, you will see that the blue "fit" starts at around 20 h rather than at 0 h as the yellow fit does.

Why is this? Both data sets start close to time zero. I don't see this behaviour for individual fits, only for the common fit, which leads me to believe I might be doing something wrong.

Kindly check if you find time.

Best regards,

Jakob

clear; clc; close all
data{1} = readmatrix('File4.xlsx');
data{2} = readmatrix('File5.xlsx');
datalen = cellfun(@(x)size(x,1),data); 
for k = 1:numel(data)
    tcat{k,:} = data{k}(:,1); 
    ycat{k,:} = data{k}(:,2);
end
tv = cell2mat(tcat); 
ydata = cell2mat(ycat);
x0 = [1e-4 1e-3 1e-3 1e-3];
[B,resnorm, residuals] = lsqcurvefit(@(x,data)omg_ODE(x,data,datalen), x0, tv, ydata); 
residualsum = sum(abs(residuals));
figure
hold on
for k = 1:numel(datalen)
    hd{k} = plot(data{k}(:,1), data{k}(:,2),'o');
    tvct = linspace(min(data{k}(:,1)), max(data{k}(:,1))*1.6, 100);
    h{k} = plot(tvct, ODE(B,tvct,k), '-', 'Color',hd{k}.Color);
end
hold off
grid
xlabel('Time (h)')
ylabel('Concentration (mg/ml)')
legend([h{:}], compose('%6g mg/ml',[10 6.75])')
function Cmat = omg_ODE(x,data,datalen)
for k = 1:numel(datalen)
    idx = 1:datalen(k);
    if k>1
        idx = sum(datalen(1:k-1))+(1:datalen(k));
    end
    t = data(idx,1);
    Ccat{k,:} = ODE(x,t,k);
end
Cmat = cell2mat(Ccat);
end
function C = ODE(x,t,k) 
c0 = [10 0 0 0 6.75 0 0 0];
[T,Cv] = ode15s(@DifEq, t, c0);
function dC = DifEq(t,c)
    izero = 2;
dcdt = zeros(8,1); 
dcdt(1) = - x(1)*c(1) + x(2)*c(2);
dcdt(2) =   x(1)*c(1) - x(2)*c(2) + izero*x(4)*c(3) - c(2)*[x(3) x(3)]*[c(3) c(4)]';
dcdt(3) =   x(3)*c(2)^izero - x(4)*c(3) - x(3)*c(2)*c(3);
dcdt(4) =   x(3)*c(3)*c(2) - x(4)*c(4)*c(2);
dcdt(5) = - x(1)*c(5) + x(2)*c(6);
dcdt(6) =   x(1)*c(5) - x(2)*c(6) + izero*x(4)*c(7) - c(6)*[x(3) x(3)]*[c(7) c(8)]';
dcdt(7) =   x(3)*c(6)^izero - x(4)*c(7) - x(3)*c(6)*c(7);
dcdt(8) =   x(3)*c(7)*c(6) - x(4)*c(8)*c(6);
dC = dcdt;
end
    C = Cv(:,4*k); %
end

Star Strider 2021-4-15

As always, my pleasure!

No worries! However that was the first option I checked for.

Unfortunately, GlobalSearch tries once and gives up for the original data, even when I gave it the previous fitted parameters as ‘x0’. Other times, it encounters a singularity in the ODE integration and just stops.

I expected better of it (specifically that it would try other initial parameter vectors), since it worked quite well other times I tried it, and with other problems.

The changes to the code to use it are:

fitfcn = @(x) norm(ydata - omg_ODE(x,tv,datalen));
x0 = [3.7517e-003     1.8992e-003     4.0151e-003  -169.0592e-006];
problem = createOptimProblem('fmincon', 'x0',x0, 'objective',fitfcn);
gs = GlobalSearch('PlotFcns',@gsplotbestf);
[B,fval] = run(gs,problem)
% x0 = [1e-4 1e-3 1e-3 1e-3];
% [B,resnorm, residuals] = lsqcurvefit(@(x,data)omg_ODE(x,data,datalen), x0, tv, ydata); 
% residualsum = sum(abs(residuals));

with the rest of the code unchanged (so it simply replaces the lsqcurvefit call with a cost function that fmincon can use, and the GlobalSearch call and supporting code).
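The scalar cost function is the key change, and it can be tried without the Global Optimization Toolbox. The sketch below is a stand-in under stated assumptions: an analytic decay model replaces omg_ODE, the data are synthetic, and fminsearch from base MATLAB replaces the fmincon/GlobalSearch pair used above.

```matlab
kTrue = 0.3;                                   % assumed true parameter
tv    = (0:0.5:10)';                           % assumed time vector
ydata = 54*exp(-kTrue*tv);                     % synthetic, noise-free data
model = @(k, t) 54*exp(-k*t);                  % stands in for omg_ODE

fitfcn = @(k) norm(ydata - model(k, tv));      % scalar cost, same form as in the thread
kFit   = fminsearch(fitfcn, 0.1);              % general-purpose minimizer as a stand-in
fprintf('fitted k = %.4f (true value %.4f)\n', kFit, kTrue);
```

Minimizing the norm of the residual vector gives the same minimizer as the sum of squared residuals that lsqcurvefit reports as resnorm, since one is the square of the other.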

Star Strider 2021-4-18

As always, my pleasure!

I will experiment with fitnlm to see if I can get the variables presented to it to work. I do not understand the difference between it and lsqcurvefit such that it fails with fitnlm and works with lsqcurvefit.

Jakob24 2021-4-19