I want to compare fit() using option 'sin1' with the same equation defined using 'fittype'.
42 次查看(过去 30 天)
显示 更早的评论
I have used fit() with the option 'sin1' to fit a noisy set of data that very approximately look like a sine wave. To understand how 'fittype' works, I would like to fit the curve 'a*sin(b*x+c)' using 'fittype'. This should give the same answer. It does if the StartPoint is the answer; it does not if I omit StartPoint or put in some random values.
How do I make 'fittype' robust to a wide range of (noisy) data and different starting points. The standard option using fit() and 'sin1' does this.
I attach some simple sample code.
% I am trying to use the fit function to fit sine and/or cosine curves to a
% cyclic dataset that is very noisy. I am most interested in the frequency
% but the noise often prevents the FFT function from giving the frequency
% that I am interested in.
% This code uses the fit function with the predefined curve 'sin1' to
% calculate the best fit. This works very well. I then try and do the
% same thing using my own curve, 'a*sin(b*x+c)', i.e. the same curve, but I
% fail. I only succeed when the StartPoint is very close to the correct
% solution.
% I feel that I must be doing something wrong, but I cannot work out what
% it is.
xdata = (-19.5:19.5)'; % I think that any evenly spaced data will do
rdata = [-3.8699, -11.9740, -1.0781, 9.4637, ... % Real data from my application
18.2971, 3.7348, -11.5774, 0.3397, 8.4653, 7.7785, -1.4079, -13.4483, -7.6968, ...
5.1592, 6.4323, -8.2318, -14.8538, 0.8788, 10.9452, 5.6580, -9.0037, -10.0607, ...
-0.5338, 8.2230, 3.8553, 1.3426, -1.0028, -1.8683, 6.7254, 10.0907, 3.2068, ...
-7.2804, -4.3917, 6.5188, 5.7636, -3.0115, -13.8272, -12.7043, 5.0029, 9.9405]';
p = polyfit(xdata, rdata, 3); % Subtracting a cubic curve to level the ydata
py = polyval(p, xdata);
ydata = rdata - py;
[f, gof, output] = fit(xdata, ydata, 'sin1'); % Using the standard MATLAB curve, 'sin1'
fprintf(' ''sin1'': f.a: %-6.2f f.b: %-6.2f f.c: %-6.2f gof.sse: %-6.1f\n', ...
f.a1, f.b1, f.c1, (gof.sse/numel(xdata)));
xdata10 = xdata(1):0.1:xdata(end); % Defining extra xdata to give a smoother line
fy = feval(f, xdata10);
mysin = fittype('a*sin(b*x+c)', ... % My attempt to do the same thing
'dependent', {'y'}, 'independent', {'x'}, 'coefficients', {'a','b','c'});
[f2, gof2, output2] = fit(xdata, ydata, mysin, 'StartPoint', [1.0, 1.0, 1.0]);
fprintf(' mysin: f.a: %-6.2f f.b: %-6.2f f.c: %-6.2f gof.sse: %-6.1f\n', ...
f2.a, f2.b, f2.c, (gof2.sse/numel(xdata)));
fy2 = feval(f2, xdata10);
plot(xdata, ydata, 'xb:')
hold on
plot(xdata10, fy, ' m--')
plot(xdata10, fy2, ' r-')
legend('Real data', '''sin1'' fit', 'My fit');
hold off
2 个评论
dpb
2023-2-20
编辑:dpb
2023-2-20
xdata = (-19.5:19.5)'; % I think that any evenly spaced data will do
rdata = [-3.8699, -11.9740, -1.0781, 9.4637, ... % Real data from my application
18.2971, 3.7348, -11.5774, 0.3397, 8.4653, 7.7785, -1.4079, -13.4483, -7.6968, ...
5.1592, 6.4323, -8.2318, -14.8538, 0.8788, 10.9452, 5.6580, -9.0037, -10.0607, ...
-0.5338, 8.2230, 3.8553, 1.3426, -1.0028, -1.8683, 6.7254, 10.0907, 3.2068, ...
-7.2804, -4.3917, 6.5188, 5.7636, -3.0115, -13.8272, -12.7043, 5.0029, 9.9405]';
p = polyfit(xdata, rdata, 3); % Subtracting a cubic curve to level the ydata
py = polyval(p, xdata);
ydata = rdata - py;
plot(xdata,rdata)
hold on
plot(xdata,py)
p
The data are symmetric; the polynomial does nothing useful, at least for the sample dataset.
However, just as a minor simplification, MATLAB supplies the equivalent with detrend so one can write the above as
ydata=detrend(rdata,3);
if there are datasets for which it might be helpful.
As for the Q? posed, as the doc notes, sinN models are very sensitive to starting points and the MATLAB builtin solver uses an (AFAICT undocumented) optimization technique internally to get an initial starting point estimate and bounds the coefficients >0 whereas the custom model uses simply a randomized starting point on the [0,1] interval and no bounds on the coefficients.
To reproduce that result, you would also have to use a fit options object and starting points at least reasonably close to those the MATLAB heuristic model uses (which, of course, is undocumented/proprietary).
So, the Q? in return is, if the builtin model produces the desired results, why not use it instead? Other than simply knowing what the builtin algorithm is, which may be of interest and perhaps even needed (to publish?).
I couldn't in a quick look, find anything more about the algorithm TMW uses...whether there's any other information publicly available or not, I don't know.
采纳的回答
Matt J
2023-2-20
编辑:Matt J
2023-2-20
How do I make 'fittype' robust to a wide range of (noisy) data and different starting points. The standard option using fit() and 'sin1' does this.
When you specify 'sin1', the solver knows what kind of fit it is looking for and can choose an appropriate strategy for forming the initial StatPoint. Conversely, when you specify the fittype as a custom model, the burden falls upon you to do that.
Choosing a StartPoint randomly is not a shrewd strategy. You know for example, that
a=(max(ydata)-min(ydata))/2
will be a better choice for a than an arbitrary selection. You can also then evaluate curve at particular x,y points to solve approximately for the other parameters. For example, since Y(0)=a*sin(c), you could do,
Y0=interp1(xdata,ydata,0);
c=asin(Y0/a);
6 个评论
Matt J
2023-2-21
I am still working on getting a robust value of sp3 (or 'c') that is not overly dependent on the first value of my data.
The method in my comment above does so.
更多回答(0 个)
另请参阅
类别
在 Help Center 和 File Exchange 中查找有关 Linear and Nonlinear Regression 的更多信息
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!