How to calculate the hat matrix for a linear model?

14 views (last 30 days)
I am interested in calculating the hat matrix (H) for a linear regression model, so that I can get the leverage values from the diagonal of H. My independent variables are contained in X, which is a 101x5 matrix where values range from 0 to 1. I tried calculating H two different ways, and got different answers. First, I just manually calculated H using the definition of the hat matrix:
X*(inv(transpose(X)*X))*transpose(X)
Next, I obtained the hat matrix by creating the linear regression model using fitlm for my X (101x5) and Y (101x1) data:
mdl = fitlm(X,Y)
After fitting, I looked at mdl.Diagnostics.HatMatrix and found that the generated hat matrix values were different compared to when I calculated them manually using the formula above. Does fitlm perform some special scaling of the X matrix during the fitting process that causes the discrepancy? I would like to know why there is a difference, and which hat matrix is actually the one I want. I will be writing a script to calculate leverages for many different models and would like to know which hat matrix calculation method to use.

Answers (1)

fred ssemwogerere
Hello, the hat matrix computed directly from the predictors (observations) and the one taken from the fitted model are expected to differ, but not by enough to cause any model-fitting discrepancies. However, the predictor-derived hat matrix (Hx) is more of an initial estimate of the model-derived hat matrix (mdl.Diagnostics.HatMatrix). As such, I think it is preferable to use the hat matrix derived from the model for subsequent computations.
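One concrete thing worth checking when comparing the two matrices: fitlm includes an intercept term by default, so the model's design matrix has a leading column of ones that the manual formula applied to the raw X does not. The sketch below is only an illustration under assumptions: the data are randomly generated (not the asker's data), the variable names H_noint, H_int and H_mdl are made up here, and fitlm requires the Statistics and Machine Learning Toolbox.

```matlab
% Hypothetical data with the same shape as in the question (assumption)
rng(0);                                   % reproducible random numbers
X = rand(101, 5);                         % predictors in [0, 1]
Y = X * [1; 2; 3; 4; 5] + 0.1 * randn(101, 1);

% Manual hat matrix from the raw X (no intercept column), as in the question
H_noint = X * ((X' * X) \ X');

% Manual hat matrix from a design matrix with a leading column of ones
Xd = [ones(size(X, 1), 1), X];
H_int = Xd * ((Xd' * Xd) \ Xd');

% Hat matrix reported by fitlm, which adds an intercept by default
mdl = fitlm(X, Y);
H_mdl = mdl.Diagnostics.HatMatrix;

% Largest elementwise differences against the fitlm result
diff_noint = max(abs(H_noint(:) - H_mdl(:)));
diff_int   = max(abs(H_int(:)   - H_mdl(:)));
fprintf('max |H_noint - H_mdl| = %g\n', diff_noint);
fprintf('max |H_int   - H_mdl| = %g\n', diff_int);
```

If the intercept is the cause, diff_int should be at rounding-error level while diff_noint is not. Fitting with fitlm(X, Y, 'Intercept', false) should then reproduce the hat matrix from the raw X, if a model without an intercept is really what is wanted.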
For any more information about this subject you could also refer to:
Regards,
Fred
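For a script that computes leverages for many models, the full n-by-n hat matrix is not needed, since the leverages are only its diagonal. A minimal sketch using an economy-size QR factorization follows. It assumes the design matrix has full column rank and that an intercept column is wanted; the data and the names Xd and lev are hypothetical.

```matlab
% Hypothetical predictors with the same shape as in the question (assumption)
rng(1);
X = rand(101, 5);

% Design matrix; drop the ones column for a model without an intercept
Xd = [ones(size(X, 1), 1), X];

% Economy-size QR: Q is 101-by-6 with orthonormal columns spanning col(Xd)
[Q, ~] = qr(Xd, 0);

% H = Q*Q', so each leverage is the squared norm of the matching row of Q
lev = sum(Q.^2, 2);

% Cross-check against the diagonal of the explicitly formed hat matrix
H = Xd * ((Xd' * Xd) \ Xd');
fprintf('max |lev - diag(H)| = %g\n', max(abs(lev - diag(H))));
```

This avoids both inv and the n-by-n product. The leverages should sum to the number of columns of the design matrix, which is a quick sanity check on any of these calculations.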
