Parameter estimation for a system of differential equations with multiple time spans
8 次查看(过去 30 天)
显示 更早的评论
Jakob24
2021-3-19
Hello,
I've got four sets of experimental data with variables c1(t1), c2(t2), c3(t3) and c4(t4), each with their own time scale that is not common for all.
I'm interested in estimating two parameters, x1 and x2, of a model so that the parameters are the best fit for the four data sets, i.e. I want to fit one x1 and one x2 for all data-sets together, not one set of x per data set.
I've successfully managed to apply this to all four data sets (i.e. 4 ODEs) that are called in lsqcurvefit.
However, I can only manage to do this with one time vector with length equal to the longest C-vector.
This is not what I want, since the experimental data is dependent on four different time scales, the lsqcurvefit will fit one of the data sets good, while the remaining thee become erroneous.
So, is it possible to solve for and fit based on each time series?
% What I've got currently:
c = [c1, c2, c3, c4]
t = t1
% but I want:
c = [c1,c2,c3,c4]
t = [t1 t2 t3 t4]
% which is then called in:
x = lsqcurvefit(@one_ODE,x0,t,c)
% where ODE is a call to a function that solves one ODE per concentration
% (4 in total), that has another function nested in it, e.g.:
function C = one_ODE(x,t)
c0 = [54 0 27 0];
[T,Cv] = ode45(@DifEq, t, c0);
function dC = DifEq(t,c)
dcdt = zeros(4,1);
dcdt(1) = - ((x(1).*c(1)) + (x(2).*c(1).*c(2)));
dcdt(2) = ((x(1).*c(1)) + (x(2).*c(1).*c(2)));
dcdt(3) = - ((x(1).*c(3)) + (x(2).*c(3).*c(4)));
dcdt(4) = ((x(1).*c(3)) + (x(2).*c(3).*c(4)));
dC = dcdt;
end
C = Cv; %
end
Thanks in advance.
Kind regards,
Jakob
采纳的回答
Star Strider
2021-3-19
This is definitely a new problem!
The usual way of dealing with it would be to vertically concatenate the time vectors and objective function matrices, and present all of them to lsqcurvefit at once, since lsqcurvefit ‘doesn’t care’ about the order of its inputs so long as they match.
I would be tempted to try something like this:
data{1} = readmatrix('File1.ext');
% . . .
data{4} = readmatrix('File4.ext');
function Cmat = cat_ODE(data)
for k = 1:4
t = data{k}(:,1);
Ccat{k,:} = oneODE(x,t);
end
Cmat = cell2mat(Ccat);
end
for k = 1:4
tcat = data{k}(:,1);
ycat{k,:} = data{k}(:,2:5);
end
tv = cell2mat(tcat);
ydata = cell2mat(ycat);
lsqcurvefit(@cat_DDE, x0, tv, ydata)
This assumes the the time vector is in the first column of the data sets, and that the other information are in the last four columns.
That is the only approach I can see to this problem. It would obviously be necessary to experiment with it.
84 个评论
Jakob24
2021-3-19
编辑:Jakob24
2021-3-19
@Star Strider, thank you for above.
I've got a few questions. Why are we calling cat_ODE in lsqcurvefit? Should it not be "one_ODE", as specified in the function file?
I'll post my code and a small excel sheet with random data and you can see what I've tried doing. I've used 2 datasets here only and will adapt it later to 4.
%% Function file:
function C = one_ODE(x,t)
[T,Cv] = ode45(@DifEq, t, c0);
function dC = DifEq(t,c)
dcdt = zeros(4,1);
dcdt(1) = - ((x(1).*c(1)) + (x(2).*c(1).*c(2)));
dcdt(2) = ((x(1).*c(1)) + (x(2).*c(1).*c(2)));
dcdt(3) = - ((x(1).*c(3)) + (x(2).*c(3).*c(4)));
dcdt(4) = ((x(1).*c(3)) + (x(2).*c(3).*c(4)));
dC = dcdt;
end
C = Cv(:,[2,4]);
end
% Separate script file:
clear; clc; close all;
data{1} = readmatrix('File1');
data{2} = readmatrix('File2');
for k = 1:2
tcat = data{k}(:,1);
ycat{k,:} = data{k}(:,1:2);
end
t = cell2mat(tcat);
c = cell2mat(ycat);
x0 = [1e-4 1e-2];
x = lsqcurvefit(@cat_ODE, x0, t, c)
function Cmat = cat_ODE(data)
for k = 1:2
t = data{k}(:,1);
Ccat{k,:} = one_ODE(x,t);
end
Cmat = cell2mat(Ccat);
end
lsqcurvefit(@cat_ODE, x0, t, c) % Why are we calling cat_ODE?
edit: sorted out typo
Now with error: "Brace indexing is not supported for variables of this type."
cellclass = class(c{1});
Error in one_script (line 11)
t = cell2mat(tcat);
Kind regards,
Jakob
Star Strider
2021-3-19
Before I go any farther on this, the Excel files each have 2 columns, the first I assume is the independent variable, and the second I assume is the dependent variable. Your ODE returns 4 columns. Which column should the data be compared ot? (I have no idea.)
Jakob24
2021-3-19
编辑:Jakob24
2021-3-19
Yes, the first column in each one is the independent variable i.e. time, while the second column is the dependent column concentration.
The ODE requires 4 equations to return the requested input but only two columns of input data (c1,c2) are needed, as I put C = Cv(:,[2,4]). What the first and third column are producing are the opposite of columns 2 and 4, but I am not interested in these.
The second column should be compared to "File1", while the fourth column corresponds to "File2".
Sorry if I was unclear
Also, perhaps concatinating everything into one column would result in issues with a non-strict increasing or decreasing tspan but should work fine if it can take each column individually
Kind regards,
Jakob
Star Strider
2021-3-19
This runs without error (except for occasional problems with the initial random parameter estimates), however the fit is definitely not good. That problem may be in the differential equations themselves, so they may need revision.
The Code —
data{1} = readmatrix('File1.xlsx');
data{2} = readmatrix('File2.xlsx');
datalen = cellfun(@(x)size(x,1),data);
function C = one_ODE(x,t,k)
c0 = [54 0 27 0];
[T,Cv] = ode45(@DifEq, t, c0);
function dC = DifEq(t,c)
dcdt = zeros(4,1);
dcdt(1) = - ((x(1).*c(1)) + (x(2).*c(1).*c(2)));
dcdt(2) = ((x(1).*c(1)) + (x(2).*c(1).*c(2)));
dcdt(3) = - ((x(1).*c(3)) + (x(2).*c(3).*c(4)));
dcdt(4) = ((x(1).*c(3)) + (x(2).*c(3).*c(4)));
dC = dcdt;
end
C = Cv(:,2*k); %
end
function Cmat = cat_ODE(x,data,datalen)
for k = 1:numel(datalen)
idx = 1:datalen(k);
if k>1
idx = datalen(k-1)+(1:datalen(k));
end
t = data(idx,1);
Ccat{k,:} = one_ODE(x,t,k);
end
Cmat = cell2mat(Ccat);
end
for k = 1:numel(data)
tcat{k,:} = data{k}(:,1);
ycat{k,:} = data{k}(:,2);
end
tv = cell2mat(tcat);
ydata = cell2mat(ycat);
x0 = rand(2,1)*10;
B = lsqcurvefit(@(x,data)cat_ODE(x,data,datalen), x0, tv, ydata)
figure
hold on
for k = 1:numel(datalen)
plot(data{k}(:,1), data{k}(:,2),'.')
plot(data{k}(:,1), one_ODE(B,data{k}(:,1),k), '-')
end
hold off
grid
It vertically concatensates the original data vectors, then uses ‘datalen’ to select the rows to be fitted in each iteration. (It also uses the size of ‘datalen’ itself to determine ‘k’.)
Since ‘k’ iterates from 1 to 2, this assignment in ‘one_ODE’:
C = Cv(:,2*k); %
determines the columns to be returned to lsqcurvefit in each iteration.
The code appears to work as designed, however the differential equations do not appear to be able to fit the data.
Jakob24
2021-3-19
编辑:Jakob24
2021-3-19
Hello @Star Strider
The code runs but, the plots are just very weird with negative estimates.
Did you get that too?
It might be te diffeqs, but I've fitted these data sets individually to get a much better fit (resnorm < 0.2) and with my initial attempts of fitting four to one time-range, it looked better (but still not accurate, of course).
EDIT: it works just fine - the initial rand values were just way off what I've used before - thank you! :)
Two final small questions:
1) Should be fairly easily adaptable to x sets of data, correct? C = Cv(:,2*k); % would this be C = Cv(:,4*k); if it iterates four data sets?
2) Is there any feasible way to:
i) make the fit smoother e.g. like linspacing the time vector if one time set is used?
ii) Choose colors for fit and raw data individually (and not combined)?
These are of course just details, look at them if you find time :)
Kind regards,
Jakob
Star Strider
2021-3-19
编辑:Star Strider
2021-3-19
I only got positive values and positive parameters in an otherwise unconstrained estimation. Part of the problem is that the two data sets are significantly different, so attempting to use one set of parameters to fit two obviously different data sets is likely not going to be optimal.
I suspect the parameter estimates from fitting the two data sets would be significantly different as well, and that one set of parameters would not work for both data sets. It might be worthwhile to consider using one of the Global Optimization Toolbox functions to search the entire parameter space for the best set of parameters, that dould fit both data sets, however even that may be overly optimistic. Since the initial parameter estimates could be the problem, what should the parameter ranges be, considering that you’ve fitted the two sets individually? Nonlinear parameter estimation routines and algorithms are extremely sensitive to the initial estimates, so appropriate initial estimates are important.
That aside, my code appears to do what was requested of it.
EDIT — (19 Mar 2021 at 18:16)
I was actually able to get a decent parameter fit with these parameters:
B =
201.1526e-006
7.8527e-003
producing this plot —
So I am now satisfied that my code works correctly and that it does what it is supposed to do!
Jakob24
2021-3-19
Hello,
I have not got a license for the global optimization toolbox so parameter estimation using non-linear fitting feels like the next best thing.
Well x0(1) = 1e-4 and x0(2) = 1e-2 are two fairly good guesses, around that the minimum is stable.
And yes - the individual fits are of course better
Star Strider
2021-3-19
I did not need the Global Optimization Toolbox to get the results in the plot I posted about 15 minutes ago. (Please see the edit in my earlier Comment.) By simply changing the original ‘x0’ to:
x0 = rand(2,1)*0.01;
I got my code to work and produce a decent fit to the data!
Jakob24
2021-3-19
Yes - I got similar results with slightly different initial values of the parameters!
I will accept this answer - many thanks! I will study your code in detail to make sure I fully understand all steps.
Two small questions:
What in the code has to be changed to adapt it to x number of datasets? From what I can see just a lot of swapping indexes and expanding the plot command, as well as this to: C = Cv(:,x*k); % where x is the number of data sets, since k will iterate from 1:x?
Is it possible to get a smooth curve fit without sharp edges?
Usually I'd plot the estimate with e.g. tt = linspace(t(min,t(max)) instead of t only. Does not feel as simple this time with the plot command nested in a for loop.
Kind regards,
Jakob
Star Strider
2021-3-19
As always, my pleasure!
Adapting it to a number of data sets would likely require only one change in ‘cat_ODE’, specifically with respect to the ‘idx’ variable:
function Cmat = cat_ODE(x,data,datalen)
for k = 1:numel(datalen)
idx = 1:datalen(k);
if k>1
idx = sum(datalen(1:k-1))+(1:datalen(k)));
end
t = data(idx,1);
Ccat{k,:} = one_ODE(x,t,k);
end
Cmat = cell2mat(Ccat);
end
That works with the present data set. I cannot test it on larger data sets. If the data are stored as a cell array (as I did nere), the rest should work. That part of the code is designed so that the appropriate index range for each part of the data set addresses the correct range in the concatenated independent and dependent data vectors.
It really is as simple as using linspace to create a new time vector. This creates a new time vector (here ‘tvct’) for each iteration of the plotting loop:
figure
hold on
for k = 1:numel(datalen)
plot(data{k}(:,1), data{k}(:,2),'.')
tvct = linspace(min(data{k}(:,1)), max(data{k}(:,1)), 150);
plot(tvct, one_ODE(B,tvct,k), '-')
end
hold off
grid
producing:
This one has 150 elements, change that to get more or fewer points.
.
Jakob24
2021-3-19
编辑:Jakob24
2021-3-19
@Star Strider very nice!
I did try to adapt the code to an additional dataset by importing it, adding corresponding Dif.Eqs, initial values etc. I don't see why it would not work, but it fails on idk in Cmat:
Matrix dimensions must agree.
Error in one_script>cat_ODE (line 39)
idx = datalen(1:k-1)+(1:datalen(k));
Jakob24
2021-3-19
Attaching "File3" as an additional input and pasting my updated code below:
clear; clc; close all
% Vertically concatenate original data:
data{1} = readmatrix('File1.xlsx');
data{2} = readmatrix('File2.xlsx');
data{3} = readmatrix('File3.xlsx');
datalen = cellfun(@(x)size(x,1),data); % uses datalen to select rows to be fitted in each iteration and to determine the size of "k"
for k = 1:numel(data) %
tcat{k,:} = data{k}(:,1);
ycat{k,:} = data{k}(:,2);
end
tv = cell2mat(tcat);
ydata = cell2mat(ycat);
x0 = [1e-4, 1e-2]; % initial value guesses
B = lsqcurvefit(@(x,data)cat_ODE(x,data,datalen), x0, tv, ydata)
figure
hold on
for k = 1:numel(datalen)
plot(data{k}(:,1), data{k}(:,2),data{k}(:,3)'.')
tvct = linspace(min(data{k}(:,1)), max(data{k}(:,1)), 150);
plot(tvct, one_ODE(B,tvct,k), '-')
end
hold off
grid
function Cmat = cat_ODE(x,data,datalen)
for k = 1:numel(datalen)
idx = 1:datalen(k);
if k>1
idx = datalen(1:k-1)+(1:datalen(k));
end
t = data(idx,1);
Ccat{k,:} = one_ODE(x,t,k);
end
Cmat = cell2mat(Ccat);
end
function C = one_ODE(x,t,k)
c0 = [54 0 27 0 13.5 0];
[T,Cv] = ode45(@DifEq, t, c0);
function dC = DifEq(t,c)
dcdt = zeros(6,1);
dcdt(1) = - ((x(1).*c(1)) + (x(2).*c(1).*c(2)));
dcdt(2) = ((x(1).*c(1)) + (x(2).*c(1).*c(2)));
dcdt(3) = - ((x(1).*c(3)) + (x(2).*c(3).*c(4)));
dcdt(4) = ((x(1).*c(3)) + (x(2).*c(3).*c(4)));
dcdt(5) = - ((x(1).*c(5)) + (x(2).*c(5).*c(6)));
dcdt(6) = ((x(1).*c(5)) + (x(2).*c(5).*c(6)));
dC = dcdt;
end
C = Cv(:,3*k); %
end
Star Strider
2021-3-19
The error the code throws is not related to the dimensions but to the differential equations, specifically the Warning that ode45 throws is:
Warning: Failure at t=4.257197e+00. Unable to meet integration tolerances without
reducing the step size below the smallest value allowed (1.421085e-14) at time t.
and as the result, lsqcurvefit throws:
Arrays have incompatible sizes for this operation.
The code works, however ode45 encounters a singularity in the differential equations, and stops, creating the dimension incompatibility.
My current code:
data{1} = readmatrix('File1.xlsx');
data{2} = readmatrix('File2.xlsx');
data{3} = readmatrix('File3.xlsx');
datalen = cellfun(@(x)size(x,1),data);
function C = one_ODE(x,t,k)
c0 = [54 0 27 0 13.5 0];
[T,Cv] = ode45(@DifEq, t, c0);
function dC = DifEq(t,c)
dcdt = zeros(6,1);
dcdt(1) = - ((x(1).*c(1)) + (x(2).*c(1).*c(2)));
dcdt(2) = ((x(1).*c(1)) + (x(2).*c(1).*c(2)));
dcdt(3) = - ((x(1).*c(3)) + (x(2).*c(3).*c(4)));
dcdt(4) = ((x(1).*c(3)) + (x(2).*c(3).*c(4)));
dcdt(5) = - ((x(1).*c(5)) + (x(2).*c(5).*c(6)));
dcdt(6) = ((x(1).*c(5)) + (x(2).*c(5).*c(6)));
dC = dcdt;
end
C = Cv(:,2*k); %
end
function Cmat = cat_ODE(x,data,datalen)
for k = 1:numel(datalen)
idx = 1:datalen(k);
if k>1
idx = sum(datalen(1:k-1))+(1:datalen(k));
end
t = data(idx,1);
Ccat{k,:} = one_ODE(x,t,k);
end
Cmat = cell2mat(Ccat);
end
for k = 1:numel(data)
tcat{k,:} = data{k}(:,1);
ycat{k,:} = data{k}(:,2);
end
tv = cell2mat(tcat);
ydata = cell2mat(ycat);
x0 = rand(2,1)*0.000001;
B = lsqcurvefit(@(x,data)cat_ODE(x,data,datalen), x0, tv, ydata)
figure
hold on
for k = 1:numel(datalen)
plot(data{k}(:,1), data{k}(:,2),'.')
tvct = linspace(min(data{k}(:,1)), max(data{k}(:,1)), 150);
plot(tvct, one_ODE(B,tvct,k), '-')
end
hold off
grid
However, reducing ‘x0’ further:
x0 = rand(2,1)*0.000001;
results in:
B =
168.7523e-006
8.2923e-003
and this plot —
And my code proves to be robust to additional data sets!
.
Jakob24
2021-3-19
Yep, added a fourth data set and it works!
Initial guesses are very very sensitive to integration tolerance limits though, even with ode15s. I'm guessing it has to do with fitting several datasets that differ by several orders of magnitude
Jakob24
2021-3-20
@Star Strider Is it possible to insert a function handle here to omit the legend entries of the raw data?
e.g. to show only the fits.
figure
hold on
for k = 1:numel(datalen) % loop to plot raw data and fit
plot(data{k}(:,1), data{k}(:,2),'.')
tvct = linspace(min(data{k}(:,1)), max(data{k}(:,1)), 150);
h = plot(tvct, one_ODE(B,tvct,k), '-');
end
hold off
legend(h,{'1','2','3','5'})
%%%% does not seem to work
Star Strider
2021-3-20
‘I'm guessing it has to do with fitting several datasets that differ by several orders of magnitude’
Correct. Using ode15s is a good choice (ode23s would also work) since the parameters now vary by several orders-of-magnitude. (This is where the Global Optimization Toolbox functions would be appropriate if additional data sets make the fitting less than straightforward.)
‘Is it possible to insert a function handle here to omit the legend entries of the raw data?’
Yes! Although not a function handle, instead a handle to the line object.
Change the figure loop to:
figure
hold on
for k = 1:numel(datalen)
plot(data{k}(:,1), data{k}(:,2),'.')
tvct = linspace(min(data{k}(:,1)), max(data{k}(:,1)), 150);
h{k} = plot(tvct, one_ODE(B,tvct,k), '-');
end
hold off
grid
legend([h{:}], compose('%d',1:numel(h)), 'Location','NW')
It now also adapts to any number of data sets, and uses the number of elements in the ‘h’ cell array (instead of ‘k’) to create the legend entries. The compose function works like sprintf and the others, so create any string or character array you want with it to display in the legend object.
Jakob24
2021-3-20
Thank you, very neat solution!
Is it possible to adjust each legend fit individually?
As the compose would display the same array for each element in 'h'. Perhaps I'm doing it wrong!
Kind regards,
Jakob
Star Strider
2021-3-20
As always, my pleasure!
‘Is it possible to adjust each legend fit individually?’
I am not certain what you want. I am using R2021a, and it works as I would expect it to work.
.
Star Strider
2021-3-20
That simply involves changing the format string in the compose call:
legend([h{:}], compose('Fit %2d',1:numel(h)), 'Location','NW')
You can also use the TeX and LaTeX commands:
legend([h{:}], compose('Fit_{%2d}',1:numel(h)), 'Location','NW')
although if you use LaTeX expressions, it will be necessary to specify 'Interpreter','latex' in the legend call (not the compose call), and format it slightly differently (note the $ operators):
legend([h{:}], compose('$Fit_{%2d}$',1:numel(h)), 'Location','NW', 'Interpreter','latex')
Experiment to get different results.
Jakob24
2021-3-20
Yeah, that makes sense.
What if you want a string specific for each curve?
e.g. calling them "12" "24" "36" "48" or something random? Perhaps "fit 1-4" was a bad example by me since it's a common label sorted by one string. Specific/non-common labels was what I originally meant :)
Star Strider
2021-3-20
Call them whatever you like in the legend call! The code cares not at all.
The important thing is that it makes sense for you, although the structure of the legend call (and whether compose will work in that context) may change.
So if you want to call them by those numbers, the compose call would change to:
compose('Fit_{%2d}',(1:numel(h))*12)
with everything else remaining the same.
If it’s random, compose could handle a random vector, although if the strings themselves change for each curve, those would need to be entered individually, since it would be unlikely that compose could work with those. You would need to experiment with compose to determine that.
Star Strider
2021-3-20
Experiment with this:
legend([h{:}], compose('%6gM', [54,27,13.5,10]))
See the documentation with respect to the different format specifiers to change the representation of the numbers.
Jakob24
2021-3-20
Yes, tried different forms of vector inputs in 'compose', does not work out.
Need to include 1:numel(h)) in compose. This was trickier than I thought.
Though it's only a detail compared to the whole solution to this, it's nice to be able to control the legends
Star Strider
2021-3-20
I am not certain what to suggest if the compose call does not do what you want. It may be necessary to just enter the strings manually.
What specifically do you want to do, and what is compose not doing that you want it to, or what is compose doing that you do not want it to do?
Jakob24
2021-3-20
I would simply like four different strings, each one describing each fit.
Four fits = four different strings.
Compose - with a vector input - produces one string only that is the same for each fit (see attached image in my previous reply).
-Jakob
Star Strider
2021-3-20
In that event, writing the strings yourself and putting them in a cell array is the only option.
Jakob24
2021-3-20
A cell array did the job!
%%%% This was really all that was needed :)
Cell_array = {'54M', '27M', '13.5M', '10M'};
legend([h{:}], Cell_array{1}, Cell_array{2}, Cell_array{3}, Cell_array{4})
Star Strider - Thank you once again for all help!
Kind regards,
Jakob
Star Strider
2021-3-20
Actually, you can do that as:
legend([h{:}], Cell_array)
Also, I though that this:
legend([h{:}], compose('%6gM', [54,27,13.5,10]))
did exactly the same thing.
Jakob24
2021-3-20
Perhaps overambigious but is it possible to color the fits the same as the corresponding data set?
Star Strider
2021-3-20
It is!
The loop changes to:
for k = 1:numel(datalen)
hd{k} = plot(data{k}(:,1), data{k}(:,2),'.');
tvct = linspace(min(data{k}(:,1)), max(data{k}(:,1)), 150);
h{k} = plot(tvct, one_ODE(B,tvct,k), '-', 'Color',hd{k}.Color);
end
I tested that and it works.
Jakob24
2021-4-15
Hello Star Strider,
Working on another model with the very neat code you built.
However, I've enocuntered a strange issue. If you run the pasted code with the attached data sets, you will see the blue "fit" starts just around 20h rather than 0h as the yellow fit does.
Why is this? Both data sets are starting close to time zero h. I don't see this behaviour for individual fits, only for the common fit which leave me to belive I might be doing something wrong.
Kindly check if you find time
Best regards,
Jakob
clear; clc; close all
data{1} = readmatrix('File4.xlsx');
data{2} = readmatrix('File5.xlsx');
datalen = cellfun(@(x)size(x,1),data);
for k = 1:numel(data)
tcat{k,:} = data{k}(:,1);
ycat{k,:} = data{k}(:,2);
end
tv = cell2mat(tcat);
ydata = cell2mat(ycat);
x0 = [1e-4 1e-3 1e-3 1e-3];
[B,resnorm, residuals] = lsqcurvefit(@(x,data)omg_ODE(x,data,datalen), x0, tv, ydata);
residualsum = sum(abs(residuals));
figure
hold on
for k = 1:numel(datalen)
hd{k} = plot(data{k}(:,1), data{k}(:,2),'o');
tvct = linspace(min(data{k}(:,1)), max(data{k}(:,1))*1.6, 100);
h{k} = plot(tvct, ODE(B,tvct,k), '-', 'Color',hd{k}.Color);
end
hold off
grid
xlabel('Time (h)')
ylabel('Concentration (mg/ml)')
legend([h{:}], compose('%6g mg/ml',[10 6.75])')
function Cmat = omg_ODE(x,data,datalen)
for k = 1:numel(datalen)
idx = 1:datalen(k);
if k>1
idx = sum(datalen(1:k-1))+(1:datalen(k));
end
t = data(idx,1);
Ccat{k,:} = ODE(x,t,k);
end
Cmat = cell2mat(Ccat);
end
function C = ODE(x,t,k)
c0 = [10 0 0 0 6.75 0 0 0];
[T,Cv] = ode15s(@DifEq, t, c0);
function dC = DifEq(t,c)
izero = 2;
dcdt = zeros(8,1);
dcdt(1) = - x(1)*c(1) + x(2)*c(2);
dcdt(2) = x(1)*c(1) - x(2)*c(2) + izero*x(4)*c(3) - c(2)*[x(3) x(3)]*[c(3) c(4)]';
dcdt(3) = x(3)*c(2)^izero - x(4)*c(3) - x(3)*c(2)*c(3);
dcdt(4) = x(3)*c(3)*c(2) - x(4)*c(4)*c(2);
dcdt(5) = - x(1)*c(5) + x(2)*c(6);
dcdt(6) = x(1)*c(5) - x(2)*c(6) + izero*x(4)*c(7) - c(6)*[x(3) x(3)]*[c(7) c(8)]';
dcdt(7) = x(3)*c(6)^izero - x(4)*c(7) - x(3)*c(6)*c(7);
dcdt(8) = x(3)*c(7)*c(6) - x(4)*c(8)*c(6);
dC = dcdt;
end
C = Cv(:,4*k); %
end
Star Strider
2021-4-15
I will do what I can to help!
The blue ‘10 mg’ line plots normally and correctly. It’s overplotted slightly by the ‘6.75 mg’ yellow line.
This is most easily seen with this addition just after the ‘h{k}’ assignment in the first loop:
h{1}.LineWidth = 2;
Then the blue line shows that it plots completely. (Delete that later, or change the line width to something smaller if you want to keep in the code to show the blue line more prominently.)
(The only thing that bothers me is that the blue line does not fit the data well. Changing the ‘x0’ values or using something such as GlobalSearch could correct that. The code would not change, however the parameters could fit better. A recent example from another Answer is in Fit of multiple data set with variable parameters .)
Jakob24
2021-4-15
Hello,
Thank you, to me it was not clear it was overplotted!
Would you mind illustrating the global optimization toolbox on this code? I have never used before.
I'll attach a third data-set as I suspect adding that will make the fit very difficult, but perhaps the Global optmisation toolbox may handle that
Star Strider
2021-4-15
As always, my pleasure!
No worries! However that was the first option I checked for.
Unfortunately, GlobalSearch tries once and gives up for the original data, even when I gave it the previous fitted parameters as ‘x0’. Other times, it encounters a singularity in the ODE integration and just stops.
I expected better of it (specifically that it would try other initial parameter vectors), since it worked quite well other times I tried it, and with other problems.
The changes to the code to use it are:
fitfcn = @(x) norm(ydata - omg_ODE(x,tv,datalen));
x0 = [3.7517e-003 1.8992e-003 4.0151e-003 -169.0592e-006];
problem = createOptimProblem('fmincon', 'x0',x0, 'objective',fitfcn);
gs = GlobalSearch('PlotFcns',@gsplotbestf);
[B,fval] = run(gs,problem)
% x0 = [1e-4 1e-3 1e-3 1e-3];
% [B,resnorm, residuals] = lsqcurvefit(@(x,data)omg_ODE(x,data,datalen), x0, tv, ydata);
% residualsum = sum(abs(residuals));
with the rest of the code unchanged (so it simply replaces the lsqcurvefit call with a cost function that fmincon can use, and the GlobalSearch call and supporting code).
Jakob24
2021-4-15
Hi,
Yeah, I’m still considering the span in orders of magnitude in the data to be an issue here. Not really sure how I would handle that. I think regardless model, that issue will cause singularities. Perhaps global search does better with the original model?
Star Strider
2021-4-15
Using it with the original model is certainly worth trying. This is a difficult problem for any optimisation function, so I am not surprised there are problems with it. Perhaps other global optimisaation routines would have better luck with it, especially if they keep searching the parameter space if the objective function has problems.
If I come up with a better solution, I will post back here with it. I cannot promise that a good solution exists.
Jakob24
2021-4-15
I understand, thank you once again! I Will keep a look out for this thread if you do post another suggestion. But I do really think 4-5 orders of magnitude in YDATA between some of the data sets to be of significance. Probably not a model limitation nor parametrization issue
Star Strider
2021-4-15
I have no idea what the problem could be with createOptimProblem, however it also requires the Optimization Toolbox to use fmincon, so it is necessary to have that installed as well.
Jakob24
2021-4-15
Right so I've been experimenting with it for the past hour.
What's odd to me is that, when fitting the data sets alone, lsqcurvefit does a good job. However, putting the same values of x0 in Global, it cannot meet the tolerance limits.
I thought global would be more robust (or perhaps it is because then sensitivity is higher?)
Star Strider
2021-4-16
The GlobalSearch and others usually are more robust. For whatever reason, they do not appear to work optimally with your data and functions.
Jakob24
2021-4-16
That sounds very nice!
I will try some other data sets with these models to see how they perform.
I might also try to normalize them to a common time-scale and use that with the monod-kinetics approach you illustrated in another post.
Though I don't see different time scales to be of issue, perhaps it is possible the fit weights too much on one data set?
Star Strider
2021-4-16
‘... perhaps it is possible the fit weights too much on one data set?’
That is certainly possible, although it seems to depend on the data set (and perhaps the model if it changed), since the original approach appeared to fit both data sets reasonably well.
Jakob24
2021-4-16
@Star Strider Well, here are some examples:
The first image (img1) I've got four datasets with the original model - looks quite fine.
Second image (img2) - I've added a 6.75 mg/ml data-set which cannot be fit what so ever.
Third image )img3) - I've sencored the 6.75 mg/ml data-set, meaning removed data of low magnitude up to 40 mg/ml (around 60 hours this data sets starts to increase visually), and now I can fit all five data sets.
These are local minima searhes with lsqcurvefit. However, it is fairly sensitive and encounters singularities (even with ODE15s).
To be honest I don't know if this is a model limitation, parameter estimating issue, improper optimization algorithm, or skew data. Seeing as the fit was possible after censuring the 6.75 mg/ml data-set, I always end up with a larger difference in magnitute as the issue in my thoughts...
Star Strider
2021-4-16
There simply may be limits on how robust the fit can be with several data sets.
Jakob24
2021-4-16
编辑:Jakob24
2021-4-16
Yeah, I'm considering that though.
However, fitting only two datasets, one of them being the 6.75 mg/ml, does not work. It's not possible to fit that data-set with any of the other ones which is strange. Also perhaps due to magnitude difference
fitfcn = @(x) norm(ydata - omg_ODE(x,tv,datalen));
x0 = [1e-6 0.001 0.001 0.001];
problem = createOptimProblem('fmincon', 'x0',x0, 'objective',fitfcn);
gs = GlobalSearch('PlotFcns',@gsplotbestf);
[B,fval] = run(gs,problem)
Would
modelfit = fitnlm(ydata,tv,fitfcn,x0)
work to produce some statistics of the fit? (for me this line does not work since fitnlm is regarded undefined with regards to the function handle. Perhaps I've not written it correctly?
Is it possible to plot the results of fitnlm in the fit?
Star Strider
2021-4-16
The fitnlm function (that requires the Statistics and Machine Learning Toolbox) would generate the necessary statistics.
It would be possible to get statistics on the fit with fitnlm, however the Statistics and Machine Learning Toolbox nonlinear parameter estimation funcitons will only fit vector dependent variables, not matrix dependent variables.
Jakob24
2021-4-16
Hi.
Yeah I've installed that add-on now. I've ever only used the default before.
tdata = data(:,1);
cdata = data(:,2);
fitfcn = @(x) norm(cdata - HSAoptimization(x,tdata));
x0 = [1e-5 1e-4];
problem = createOptimProblem('fmincon', 'x0',x0, 'objective',fitfcn);
gs = GlobalSearch('PlotFcns',@gsplotbestf);
[B,fval] = run(gs,problem)
modelfit = fitnlm(cdata,tdata,fitfcn,x0)
What additional input would I need here? fitnlm does not work as currently written:
Error using classreg.regr.NonLinearFormula (line 226)
The model function must accept two arguments.
Error in NonLinearModel/createFormula (line 1659)
formula =
classreg.regr.NonLinearFormula(modelDef,coefNames,predictorVars,responseVar,varNames,ncoefs);
Error in NonLinearModel.fit (line 1413)
model.Formula = NonLinearModel.createFormula(supplied,modelDef,X,ncoefs,coefNames, ...
Error in fitnlm (line 99)
model = NonLinearModel.fit(X,varargin{:});
/Jakob
Star Strider
2021-4-16
Use the same function as with lsqcurvefit, specifically:
@(x,data)omg_ODE(x,data,datalen)
That will work with fitnlm.
Jakob24
2021-4-16
Like so? Still not enough input arguments. Does it require datalen somewhere in there as well?
tv = cell2mat(tcat);
ydata = cell2mat(ycat);
fitfcn = @(x) norm(ydata - omg_ODE(x,tv,datalen));
x0 = [1e-6 1e-4];
problem = createOptimProblem('fmincon', 'x0',x0, 'objective',fitfcn);
gs = GlobalSearch('PlotFcns',@gsplotbestf);
[B,fval] = run(gs,problem)
modelfit = fitnlm(ydata,tv,omg_ODE,x0)
Or is it not compatible with this code due to matrix? If so, is there any other statistical modelling I can apply and subsequently plot?
Star Strider
2021-4-16
Yes. For the code to work correctly, ‘datalen’ has to exist in the workspace.
I initially created it to return the row sizes of the ‘data’ cell arrays as:
datalen = cellfun(@(x)size(x,1),data);
So it should be there.
Jakob24
2021-4-16
Yeah I figured.
What would the correct input be? I still cannot get it right still.
Star Strider
2021-4-16
Input to what?
The ‘datalen’ computation uses the ‘data’ cell arrays read in from the files in the readmatrix calls just prior to it.
One small though important point is that the ‘x0’ value for fitnlm needs to be ‘B’ from the GlobalSearch result. It may tweak the parameters slightly (this is what it is supposed to do, so that’s a side benefit), and it will produce the desired statistics as well.
Jakob24
2021-4-16
Sorry, I meant the input arguments to fitnlm. I know the generic case but I can’t get it to work
Star Strider
2021-4-16
I cannot get fitnlm to work either. It throws a data size incompatibility error, however the data sizes are the same. I do not understand what the problem is, since the data sizes are compatible and everything worked with lsqcurvefit.
Jakob24
2021-4-16
Everything is strange with this case, haha! I agree, I can’t see why it would not work. Maybe there is a workaround or some other alternative for some nice statistics?
Star Strider
2021-4-16
The Statistics and Machine Learning Toolbox will not provide all of them, unfortunately. I already did some other experiments in that regard. This will likely require going back to lsqcurvefit and gathering more of its outpus, since it may be possible to use them with nlparci to get confidence intervals on the parameters. I doiubt that it would be possible to get statistics on the fit itself, though.
Jakob24
2021-4-16
Well, that's a workaround so for me, wether it is possible to get them on the fit iself or not, is plausible
Star Strider
2021-4-16
It would likely be difficult to plot statistics on the fit anyway, considering the fit is on different data sets. Consult the documentation on nlparci and lsqcurvefit to understand what the first needs and what the second can provide to estimate the parameter confidence intervals.
Jakob24
2021-4-16
I guess this is correct for two data sets?
[B, resnorm, r, exitflag, output, lambda, J] = lsqcurvefit(@(x,data)omg_ODE(x,data,datalen), x0, tv, ydata);
ci = nlparci(B,r,'Jacobian',J)
B =
1.0e-03 *
0.0187 -0.1350
ci =
1.0e-03 *
0.0126 0.0247
-0.2171 -0.0529
Star Strider
2021-4-16
It is.
The problem is that both parameters include 0 in the confidence intervals (are not statistically different from 0), meaning that neither are necessary in the fit.
Jakob24
2021-4-18
I did look more into this and with some other data, I got parameter estimates nicely within 95% CI using nlparci and lsqcurvefit.
I have a few small questions.
- Since the global fit was not compatible with fitnlm, would it be compatible with the local fit by lsqcurvefit?
- Is it a big no trying to plot statistics on the fits?
3. Regarding the plots - I am getting fairly pale graphs and low resolution which I assume has to do with rendering in ZBuffer or open GL instead of painters. I've tried to force matlab into using painters by
print -painters
Without success. Perphaps I am not positioning it in the right location of the code, or is this an impossible task due to the complexity? The overall quality feels lower even with e.g. export_fig not rendering the graphs with painters, but perpahs there is a workaround/alternative way of exporting to improve quality?
Star Strider
2021-4-18
1. The purpose for using GlobalSearch was to search the entire parameter space to see if a better parameter set was possible. The MultiStart function is also an option. (I also considered using ga, however I doubt if this is an appropriate problem for ga.) The idea behind using lsqcurvefit on the results of a global search would be to use those parameters to calculate the desired statistics.
2. Plotting the statistics (I assume by that the confidence limits on the fit) is appropriate, although I am not certain that it would be possible here.
3. I am not certain what the problem with the graphics rendering could be. One option is to see if a graphics card driver update is available.
Jakob24
2021-4-18
- Well the global parametrisation performed very similar to the local one using lsqcurvefit. so I will stick to lsqcurvefit as it allows me to obtain the confidence intervals. If I recall correctly, none of us got fitnlm to work with Globalsearch. Therefore, would fitnlm work with lsqcurvefit to produce desired statistics? I.e. something like:
x0 = [1e-5 1e-4];
[B,..] = lsqcurvefit(@(x,data)cat_ODE(x,data,datalen), x0, tv, ydata)
fitfcn = @(x) norm(ydata - one_ODE(x,tv,datalen));
modelfit = fitnlm(tv,datalen,fitfcn,x0)
% If that makes any sense
2. Yes, I was refering to the confidence limits on the fit. I can't see where I would start such attempt here with multiple time spans
3. Well I can compare with another computer that has a better graphics card. Usually, plotting only one curve/one data-set looks fine, while several ones reduce quality (which is why I thought the rendering engine would be the cause)
Star Strider
2021-4-18
1. The fitnlm function will not work with GlobalSearch (or likely with MultiStart) because they are from different Toolboxes. The Global Optimization Toolbox functions apparently work only with the Optimization Toolbox functions. The fitnlm function is not compatible with the code that estimates both curves, however I havee no idea what the problem is.
2. Me, either.
3. I have no idea, unless the graphics card memory would somehow be limited, and it is making trade-offs.
Jakob24
2021-4-18
1. So essentially I am limited to the statistics I’ve currently got as it does not seem compatible with multiple curves? I’m interested in e.g. The T-test etc. Fitnlm does provide. Although should it not be possible, What I have currently might do.
Star Strider
2021-4-18
Unfortunately, yes.
The fitnlm function does not work with the code written to fit both curves. It should, since there is nothing inhernetly incompatible with it, however it throws an error and stops. (That has to do with nlinfit that throws the same error and that fitnlm apparently uses.)
Star Strider
2021-4-18
As always, my pleasure!
I will experiment with fitnlm to see if I can get the variables presented to it to work. I do not understand the difference between it and lsqcurvefit such that it fails with fitnlm and works with lsqcurvefit.
更多回答(0 个)
另请参阅
类别
在 Help Center 和 File Exchange 中查找有关 Quadratic Programming and Cone Programming 的更多信息
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!发生错误
由于页面发生更改,无法完成操作。请重新加载页面以查看其更新后的状态。
您也可以从以下列表中选择网站:
如何获得最佳网站性能
选择中国网站(中文或英文)以获得最佳网站性能。其他 MathWorks 国家/地区网站并未针对您所在位置的访问进行优化。
美洲
- América Latina (Español)
- Canada (English)
- United States (English)
欧洲
- Belgium (English)
- Denmark (English)
- Deutschland (Deutsch)
- España (Español)
- Finland (English)
- France (Français)
- Ireland (English)
- Italia (Italiano)
- Luxembourg (English)
- Netherlands (English)
- Norway (English)
- Österreich (Deutsch)
- Portugal (English)
- Sweden (English)
- Switzerland
- United Kingdom (English)
亚太
- Australia (English)
- India (English)
- New Zealand (English)
- 中国
- 日本Japanese (日本語)
- 한국Korean (한국어)