Multicollinearity / Collinearity Problem
Hello,
I am running regressions (using regstats) in MATLAB and experiencing the singular matrix error. I know for a fact that I did NOT put redundant variables in the model, e.g. a constant plus a full set of dummy variables (I removed the constant using the design matrix specification).
This leaves the possibility of some collinearity between the variables in the model (I have 115 regressors and 8760 observations). What's puzzling to me is that regstats does not handle these kinds of issues automatically. Statistical packages such as Stata, RATS and others automatically drop collinear variables and proceed with estimating the regression. Clearly, identifying what is causing the collinearity would be a pain in a model with 115 variables.
My question #1: does MATLAB have a way of automatically identifying the variable(s) generating multicollinearity and correcting the regressor list to deal with this problem?
A related question: there are cases when I want to drop a variable from the model (e.g. variable 73 out of 115) for a particular model run, but I don't want to remove column 73 from the matrix, because that would shift the estimated regression coefficients. The coefficient for column 74 would become regresults.beta(73), and so on, up to regresults.beta(115), which would become regresults.beta(114). In most statistical packages, one can simply fill the relevant column of the independent variable matrix with zeros, without having to re-align the beta vector elements that one wishes to call in later calculations. MATLAB gives me a singular matrix error when I try to fill a column of the explanatory variable matrix with zeros. Is there a different way to achieve the same effect in MATLAB?
Thank you in advance for your help!
Best regards,
Marek Kolodziej
Accepted Answer
the cyclist
2011-3-23
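I expect you could use the rank function to get at your collinearity issue: if rank(X) is smaller than the number of columns of X, at least one column is a linear combination of the others.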
This will not be as automatic as you would like, but maybe you could do some manual exploration.
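One way to make that exploration less manual is sketched below. It is only an illustration on made-up data, not something regstats does for you: it uses a column-pivoted QR factorization to pick a full-rank subset of columns, then fits by least squares on that subset while writing the coefficients back into a full-length vector, so the coefficient for column k stays at beta(k). The variable names (keep, drop, tol) and the tolerance rule are assumptions chosen for the example.

```matlab
% Synthetic design matrix with deliberate rank deficiency (illustrative only).
rng(1);
n = 100;
X = randn(n, 6);
X(:, 4) = 2 * X(:, 1) - X(:, 2);   % column 4 is an exact linear combination
X(:, 6) = 0;                       % column 6 is "switched off" with zeros
y = X(:, [1 2 3 5]) * [1; 2; 3; 4] + 0.1 * randn(n, 1);

p = size(X, 2);
fprintf('rank(X) = %d, columns = %d\n', rank(X), p);

% Economy-size QR with column pivoting; perm is a permutation vector and
% abs(diag(R)) is non-increasing, so small trailing entries signal dependence.
[~, R, perm] = qr(X, 0);
d   = abs(diag(R));
tol = max(size(X)) * eps(d(1));    % assumed tolerance, similar in spirit to rank()
r   = sum(d > tol);

keep = sort(perm(1:r));            % a full-rank subset of columns
drop = sort(perm(r+1:end));        % columns flagged as redundant

% Fit on the kept columns only, but store results at the original positions.
beta       = zeros(p, 1);
beta(keep) = X(:, keep) \ y;       % dropped columns keep a coefficient of 0
```

Which member of a dependent group ends up in drop is not unique (here it could be column 1, 2, or 4), so it is worth inspecting drop rather than treating it as the definitive culprit. An all-zero column always lands in drop, which gives the "fill with zeros, keep the indexing" behaviour asked about in the question.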
More Answers (1)
Andrew Newell
2011-3-23
Partial Least Squares Regression with cross-validation may be the answer to your collinearity problem. To deal with the second problem, one approach is to use dataset arrays (see Removing Observations from Dataset Arrays further down the page).
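A minimal sketch of what that could look like, assuming the Statistics Toolbox function plsregress is available. The data are synthetic, and the cap of 4 components and the 10 folds are arbitrary choices made for the example, not recommendations for the 115-regressor model.

```matlab
% Synthetic data with two nearly collinear predictors (illustrative only).
rng(0);
n = 200;
X = randn(n, 5);
X(:, 5) = X(:, 1) + 1e-6 * randn(n, 1);   % column 5 almost duplicates column 1
y = X(:, 1:4) * [1; -2; 0.5; 3] + 0.1 * randn(n, 1);

maxComp = 4;                              % assumed upper limit on components

% 10-fold cross-validated PLS. Row 2 of MSE holds the estimated mean squared
% prediction error for y; column k+1 corresponds to k components.
[~, ~, ~, ~, ~, ~, MSE] = plsregress(X, y, maxComp, 'CV', 10);
[~, idx] = min(MSE(2, :));
bestComp = max(idx - 1, 1);               % use at least one component

% Refit on all data with the chosen number of components.
% beta(1) is the intercept; beta(2:end) line up with the columns of X.
[~, ~, ~, ~, beta] = plsregress(X, y, bestComp);
yfit = [ones(n, 1), X] * beta;
```

Because PLS regresses on a few components rather than on the raw columns, the near-duplicate column does not cause a singular matrix error, and beta still has one entry per original column, so the indexing stays aligned with X.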