Why do fitlm (or regress) and estimation using the mathematical equation give different results?

I am trying to estimate linear regression coefficients directly from the mathematical equation β = inverse(X'X)X'Y, but the result differs from what the standard functions return. Why does that happen?
Here is the code:
% X = input data
% Y = outcome
% Using the fitlm command to estimate the multiple linear regression model
lin_mdl = fitlm(X,Y);
b1 = lin_mdl.Coefficients.Estimate;
% Using the regress command to estimate the multiple linear regression model
% (regress does not add a constant term, so prepend a column of ones)
X1 = [ones(size(X,1),1) X];
b2 = regress(Y,X1);
% Using mathematical equation
b3 = inv(X1'*X1)*X1'*Y;
% Comparing the coefficients
[b1 b2 b3]
And the output is:
ans =
1.0e+05 *
0.0002 0.0002 -5.6828
-0.0000 -0.0000 -0.0758
0.0000 0.0000 -0.0092
-0.0001 -0.0001 -0.1538
-0.0000 -0.0000 -0.0023
-0.0000 -0.0000 -0.2201
0.0000 0.0000 0.4286
0.0000 0.0000 -0.0009
0.0000 0.0000 0.1575
-0.0000 -0.0000 -0.3488
0.0000 0.0000 0.0040
-0.0000 -0.0000 -0.0057
0 0 -7.1398
0.0014 0.0014 -0.5267
0.0004 0.0004 0.0004
-0.0001 -0.0001 -0.0001
0.0000 0.0000 0.0000
0.0000 0.0000 0.0000
0.0000 0.0000 0.0000
0.0000 0.0000 0.0000
0.0000 0.0000 0.0000
-0.0000 -0.0000 -0.0000
-0.0000 -0.0000 -0.0000
0.0000 0.0000 0.0000
-0.0002 -0.0002 -0.0002
-0.0002 -0.0002 -0.0002
-0.0002 -0.0002 -0.0002
-0.0001 -0.0001 -0.0001
-0.0001 -0.0001 -0.0001
-0.0002 -0.0002 -0.0002
-0.0003 -0.0003 -0.0003
-0.0002 -0.0002 -0.0002
-0.0003 -0.0003 -0.0003
-0.0003 -0.0003 -0.0003
-0.0003 -0.0003 -0.0003
-0.0001 -0.0001 -0.0001
-0.0001 -0.0001 -0.0001
0.0000 0.0000 0.0000
0.0000 0.0000 0.0000
0.0001 0.0001 0.0001
0.0001 0.0001 0.0001
0.0001 0.0001 0.0001
-0.0002 -0.0002 -0.0002
-0.0001 -0.0001 -0.0001
-0.0001 -0.0001 -0.0001
-0.0002 -0.0002 -0.0002
-0.0002 -0.0002 -0.0002
-0.0001 -0.0001 -0.0001
-0.0003 -0.0003 -0.0003
The coefficients from the mathematical equation (third column) differ from those returned by fitlm and regress (first two columns). Why is that? The correlation matrix obtained with corr(X) can be visualized as follows:

Answers (1)

Tom Lane 2016-2-23
The first two columns of coefficients have what appear to be exact zeros in row 13, which corresponds to column 12 of X because row 1 is the constant. I suggest you try fitting a model with column 12 of X as the output (response) variable and the rest of X as the input (predictor) variables. I suspect you will find that column 12 is very close to an exact linear function of some set of the other columns.
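A minimal sketch of that check, using made-up data rather than the poster's X (which is not available here). The matrix, the choice of column 4 as the suspect column, and all variable names are assumptions for illustration; with the real data the suspect column would be 12. It uses only base MATLAB (backslash instead of fitlm) so it runs without the Statistics Toolbox.

```matlab
% Synthetic data, assumed for illustration only (not the poster's X).
rng(0);                                  % reproducible random numbers
n = 200;
X = randn(n, 5);
X(:, 4) = 2*X(:, 1) - 3*X(:, 2);         % make column 4 a linear combination

% Regress the suspect column on the remaining columns plus a constant.
suspect = 4;
others  = setdiff(1:size(X, 2), suspect);
A       = [ones(n, 1) X(:, others)];
coef    = A \ X(:, suspect);             % least-squares fit
resid   = X(:, suspect) - A*coef;
sst     = sum((X(:, suspect) - mean(X(:, suspect))).^2);
R2      = 1 - sum(resid.^2) / sst;       % close to 1 means (near) collinearity

% Rank of the full design matrix, including the constant column.
X1 = [ones(n, 1) X];
fprintf('R^2 = %.6f, rank = %d of %d columns\n', R2, rank(X1), size(X1, 2));
```

An R^2 that is essentially 1, together with a rank smaller than the number of columns, would indicate that the suspect column carries no information beyond the others, which is the situation suspected for column 12.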
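Inverting X1'*X1 is notoriously ill-conditioned: forming the product squares the condition number of X1, so any near-collinearity in the columns is amplified before the inverse is even taken. Another way to do this is b = X1\Y, which is in principle the same thing but better conditioned.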
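The difference can be illustrated with a small sketch on synthetic data. Everything here is assumed for illustration: the data, the near-duplicate third column, and the noise-free response, which is chosen so that the exact coefficients are known and the error of each method can be measured.

```matlab
% Synthetic, nearly collinear design (assumed; not the poster's data).
rng(1);                                    % reproducible random numbers
n = 100;
X = randn(n, 3);
X(:, 3) = X(:, 1) + 1e-6*randn(n, 1);      % column 3 almost duplicates column 1
X1 = [ones(n, 1) X];

beta_true = [1; 2; -1; 0.5];
Y = X1*beta_true;                          % noise-free, so beta_true is exact

b_inv = inv(X1'*X1)*X1'*Y;                 % explicit inverse of X1'*X1
b_bs  = X1\Y;                              % backslash (least squares)

fprintf('cond(X1)     = %.3g\n', cond(X1));
fprintf('cond(X1''*X1) = %.3g\n', cond(X1'*X1));
fprintf('error, inv       = %.3g\n', norm(b_inv - beta_true));
fprintf('error, backslash = %.3g\n', norm(b_bs  - beta_true));
```

On a design like this, cond(X1'*X1) comes out at roughly the square of cond(X1), and the backslash solution is typically several orders of magnitude closer to beta_true. When the columns are exactly dependent, as the zeros in row 13 suggest, X1'*X1 is singular and the output of inv should not be trusted at all.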
