polyfit doesn't fit the data

Question

florence briton 2015-5-19

https://ww2.mathworks.cn/matlabcentral/answers/217545-polyfit-doesn-t-fit-the-data

Answered: Augusto Samussone 2021-3-17

>> mdl=fitlm(X,Y)
mdl = 
Linear regression model:
    y ~ 1 + x1
Estimated Coefficients:
                   Estimate      SE        tStat       pValue  
                   ________    _______    _______    __________
      (Intercept)     16.325      1.8303     8.9194     1.054e-16
      x1             -1.1809     0.14751    -8.0059    4.5928e-14
Number of observations: 250, Error degrees of freedom: 248
Root Mean Squared Error: 1.19
R-squared: 0.205,  Adjusted R-Squared 0.202
F-statistic vs. constant model: 64.1, p-value = 4.59e-14
>> plot(X,Y,'.')
hold on
plot(X,polyval(p,X),'r.')
hold on
f=16.3250-1.1809*X
plot(X,f,'.g')
Attached figure:
in blue: data
in red: polyfit regression
in green: fitlm model

polyfit does not fit the data, whereas fitlm does. Is there anything I can do to fix that? I would rather not use fitlm, as I have to run thousands of regressions and it seems more complex and to use more memory.

Comments

John D'Errico 2015-5-19

Usually, when someone says something like this, they did not really do what they think they did. The fact is, polyfit WILL generate the same model as does fitlm. Here for example, you don't show where you did the polyfit call. Did you use the same data? What were the coefficients produced by polyfit?

florence briton 2015-5-19

Edited: Matt J 2015-5-19

Yes, I called polyfit on the same data. I just did it again (the variables have different names, but they are the same):

>> [p,s,mu]=polyfit(logX,logY,1);
>> plot(logX,logY,'.')
hold on
plot(logX,polyval(p,logX),'r.')

polyfit produces:

p=[-0.6039,1.6844]
mu=[12.3973;0.5114]
s.R=[15.7797,-5.7510e-14;0,-15.8114]
s.df=248
s.normr=18.7448

Answer 1

Titus Edelhofer 2015-5-19

Hi,

I think the problem lies elsewhere. Note that if you call polyfit with

[p,s,mu] = polyfit ...

you get the scaled and shifted representation of the polynomial. Therefore you need mu to transform logX into the centered and scaled variable (logX - mu(1))/mu(2) before evaluating the polynomial (see the doc for polyfit).

If you call polyfit with only the output p, you get a different result, one that coincides with fitlm.

Titus
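
A minimal sketch of the point above, assuming synthetic data in place of logX and logY (the variable names x, y, pRaw, pHat, yRaw, yHat and yWrong are hypothetical and not from the thread). It relies on the documented polyval(p,x,[],mu) syntax, which applies the centering and scaling before evaluating the polynomial:

```matlab
% Synthetic data (hypothetical; stands in for logX and logY from the thread)
rng(0);                                % make the example reproducible
x = 12 + 0.5*randn(250,1);
y = 16 - 1.2*x + 1.2*randn(250,1);

pRaw        = polyfit(x, y, 1);        % coefficients in the original x
[pHat,~,mu] = polyfit(x, y, 1);        % coefficients in xhat = (x - mu(1))/mu(2)

yRaw   = polyval(pRaw, x);             % correct
yHat   = polyval(pHat, x, [], mu);     % correct: polyval centers and scales x first
yWrong = polyval(pHat, x);             % wrong: pHat evaluated at the uncentered x

disp(max(abs(yRaw - yHat)))            % rounding error only
disp(max(abs(yRaw - yWrong)))          % large, the mismatch seen in the question
```

The call that produced the red points in the question corresponds to yWrong here: coefficients from the three-output syntax evaluated without mu.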

Comments

florence briton 2015-5-19

Thank you for your answer Titus, I didn't notice that point

Geoff 2018-5-21

I realize this post is now three years old, but as of R2018a the help file describing polyfit is still not clear.

In the help file, p is described as the coefficients for the unscaled and uncentered data. In the description of the [p,S,mu] syntax, however, there is no mention that the returned p now applies to the scaled and centered data.

Perhaps MathWorks should change the description to use [phat,S,mu] rather than p, or automatically rescale the polynomial coefficients back to the original space when scaling and centering are used.
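
The rescaling suggested above can be done by hand. The following is only a sketch under stated assumptions, not how polyfit itself works: the data are synthetic, and the names lin, pRaw and err are hypothetical. It expands the polynomial in xhat = (x - mu(1))/mu(2) back into a polynomial in x using Horner's scheme with conv:

```matlab
% Hypothetical data for a quadratic fit
rng(1);
x = 10 + 100*rand(50,1);
y = 3 - 0.2*x + 0.01*x.^2 + randn(50,1);

[pHat,~,mu] = polyfit(x, y, 2);        % coefficients in xhat = (x - mu(1))/mu(2)

% xhat written as a polynomial in x: xhat = x/mu(2) - mu(1)/mu(2)
lin = [1/mu(2), -mu(1)/mu(2)];

% Horner's scheme on polynomials:
% pRaw = (...(pHat(1)*xhat + pHat(2))*xhat + ...) + pHat(end)
pRaw = pHat(1);
for k = 2:numel(pHat)
    pRaw = conv(pRaw, lin);            % multiply the running polynomial by xhat
    pRaw(end) = pRaw(end) + pHat(k);   % add the next coefficient to the constant term
end

% Both representations should predict the same values
err = max(abs(polyval(pRaw, x) - polyval(pHat, x, [], mu)));
disp(err)
```

For high-degree fits the expanded coefficients can be poorly conditioned, which is the reason the centered form exists, so keeping pHat and mu together and evaluating with polyval(pHat,x,[],mu) is probably the safer default.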

Answer 2

John D'Errico 2015-5-19

Edited: John D'Errico 2015-5-19

GIVE US A BREAK!

You used polyfit on the log of the data! But you used fitlm on the unlogged data! Here are the calls you yourself showed:

mdl=fitlm(X,Y)
[p,s,mu]=polyfit(logX,logY,1);

I've just copied what you yourself typed. Don't tell us that the variables are the same, but they just have a different name. Show us what happens when you use polyfit like this:

p = polyfit(X,Y,1)

Please tell us why we should be surprised that there is a difference. If you change the data, then expect to get a different answer.

Next, READ THE HELP! The help for polyfit tells us that when you call it with THREE output arguments, it performs a centered and scaled regression.

[P,S,MU] = polyfit(X,Y,N) finds the coefficients of a polynomial in
  XHAT = (X-MU(1))/MU(2) where MU(1) = MEAN(X) and MU(2) = STD(X). This
  centering and scaling transformation improves the numerical properties
  of both the polynomial and the fitting algorithm.

So in order to predict the result, you need to use the centered and scaled variable.

XHAT = (X-MU(1))/MU(2);

where MU(1) = MEAN(X) and MU(2) = STD(X). For example...

X = rand(10,1);
X = 10 + 100*rand(10,1);
Y = rand(size(X));
p = polyfit(X,Y,1)
p =
   -0.0047359      0.94733
[p,S,mu] = polyfit(X,Y,1)
p =
     -0.15668      0.58731
S = 
        R: [2x2 double]
       df: 8
    normr: 0.7445
mu =
       76.021
       33.084

See that there IS a difference in the coefficients produced. READ THE HELP. What you actually did wrong is only for you to know, since you have not proved to us that logX and X are truly the same thing, as with logY and Y.

Ok. Since you actually gave us the results from polyfit, let's try something:

p=[-0.6039,1.6844];
mu=[12.3973;0.5114];
syms X Y
xhat = (X - mu(1))/mu(2);
yhat = p(1)*xhat + p(2);
vpa(yhat,10)
ans =
16.32407436 - 1.180876027*X

AGAIN, IF you use polyfit with THREE output arguments, it produces a DIFFERENT model. You can recover the untransformed model as I did, but if you can't bother to read the help, what do you expect?
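
For a straight-line fit, the same recovery can be sketched without the Symbolic Math Toolbox. Substituting xhat = (X - mu(1))/mu(2) into p(1)*xhat + p(2) gives a slope of p(1)/mu(2) and an intercept of p(2) - p(1)*mu(1)/mu(2). The names slope and intercept below are hypothetical; the numbers are the ones posted in the thread:

```matlab
p  = [-0.6039, 1.6844];     % from [p,s,mu] = polyfit(logX,logY,1) in the thread
mu = [12.3973; 0.5114];

slope     = p(1)/mu(2);                 % coefficient of X in the original units
intercept = p(2) - p(1)*mu(1)/mu(2);    % constant term in the original units

fprintf('slope = %.4f, intercept = %.4f\n', slope, intercept);
% prints: slope = -1.1809, intercept = 16.3241
```

These agree with the fitlm estimates (16.325 and -1.1809) to within the rounding of the posted p and mu.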

Answer 3

florence briton 2015-5-19

No, no, it's just that I changed the names, which is confusing, I am sorry. Here is the whole thing again:

>> [p,s,mu]=polyfit(logX,logY,1);

polyfit produces:

p=[-0.6039,1.6844]
mu=[12.3973;0.5114]
s.R=[15.7797,-5.7510e-14;0,-15.8114]
s.df=248
s.normr=18.7448

>> mdl=fitlm(logX,logY)
mdl = 
Linear regression model:
    y ~ 1 + x1
Estimated Coefficients:
                   Estimate      SE         tStat       pValue
                   ________    _______    _______    __________
    (Intercept)     16.325      1.8303     8.9194     1.054e-16
    x1             -1.1809     0.14751    -8.0059    4.5928e-14
Number of observations: 250, Error degrees of freedom: 248
Root Mean Squared Error: 1.19
R-squared: 0.205,  Adjusted R-Squared 0.202
F-statistic vs. constant model: 64.1, p-value = 4.59e-14
>> plot(logX,logY,'.')
hold on
plot(logX,polyval(p,logX),'r.')
hold on
f=16.3250-1.1809*logX;
plot(logX,f,'.g')

Comments

John D'Errico 2015-5-19

Edited: John D'Errico 2015-5-19

ARGH!

READ THE ANSWERS!

First, PROVE TO US THAT X AND logX ARE THE SAME. For example:

min(X-logX)
max(X-logX)

should both be essentially zero.

Then call polyfit using a call that WILL PRODUCE the same result!

p = polyfit(X,Y,1)

READ THE HELP FOR POLYFIT! READ THE ANSWERS. Don't just keep repeating the same nonsense. (See my edit to my answer, where I show how the coefficients that you got ARE the correct coefficients but for a different model.)

florence briton 2015-5-19

Well, you edited your answer while I was writing, so I couldn't see it... Anyway, the problem is solved: I hadn't seen that the regression was centered and normalized.

Thank you for everything and most of all for being so polite