linfitregsel - term selection in linear regression

Version 1.0.0.0 (7.7 KB) by Bill Whiten

Select terms in linear regression (equation fitting) using eigenvectors and term significance


486 downloads

Updated 2013/2/28


linfitregsel selects relevant terms in a linear regression and calculates the regression equation. The value of x is calculated to minimise the sum of squares of A*x-b. Terms are selected by first rejecting small eigenvalues with their associated eigenvectors, and then removing insignificant terms, repeating until both the eigenvalue and significance criteria are satisfied. This may give a different result from stepwise or all-subsets term selection (which usually also give different results from each other). In particular, correlated terms are handled very differently, giving an equation that balances correlated terms. As the coefficient calculation is different from normal regression, the variance matrix of the result is also different.
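The eigenvalue rejection step can be illustrated with a minimal sketch. This is not the linfitregsel source code: it assumes the eigenvalues are those of A'*A, that the cut-off is taken relative to the largest eigenvalue, and it omits the term significance step entirely. The data and variable names are made up for illustration.

```matlab
% Illustrative sketch only: NOT the linfitregsel implementation.
% Assumption: eigenvalues are those of A'*A and the cut-off is relative
% to the largest eigenvalue (compare optn.reg, default 0.01).
rng(0);                                    % reproducible synthetic data
n = 50;
t = linspace(0,1,n)';
A = [ones(n,1), t, t + 1e-4*randn(n,1)];   % columns 2 and 3 nearly identical
b = 1 + 2*t + 0.05*randn(n,1);

[V,D] = eig(A'*A);                         % symmetric matrix: real eigenpairs
lam   = diag(D);
reg   = 0.01;                              % threshold value used for this sketch
keep  = lam >= reg*max(lam);               % reject small eigenvalues

Vk = V(:,keep);                            % retained eigenvectors
x  = Vk*((Vk'*(A'*b))./lam(keep));         % least squares within retained subspace

nreg = nnz(keep);                          % number of eigenvectors used
disp(x');                                  % coefficients of columns 2 and 3 come out nearly equal
```

Because the direction that separates the two nearly identical columns is rejected, the slope is shared between them rather than assigned arbitrarily to one, which is one way to read "balances correlated terms".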

The function is used as:
[x,vm,info]=linfitregsel(A,b,optn)
Where A is the matrix of predictor (independent) variables (including a unit column if a constant term is required), and b is a column of values to be predicted (dependent values) using A. The optional argument (optn) can be used to change default values:
optn.reg is the ratio to the largest eigenvalue below which eigenvalues are rejected (default 0.01).
optn.treg is the ratio of coefficient values to their standard errors below which the term is excluded from the regression. This can be given as a vector of increasing values to allow for changes in term significance as the equation is refined. These values can be chosen using Student-t probabilities (default [0.5,1.0,2.0]).
optn.sel allows the forcing of terms out of, or into, the regression (default all columns available for selection).
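One possible way to pick optn.treg values from Student-t probabilities is sketched here. It assumes tinv from the Statistics and Machine Learning Toolbox is available, and that the residual degrees of freedom are the number of rows of A minus the number of candidate terms; the sizes and probability levels are hypothetical.

```matlab
% Sketch: two-sided Student-t critical values as candidate optn.treg entries.
% Assumptions: tinv (Statistics and Machine Learning Toolbox) is available;
% nobs, nterm and p are hypothetical values chosen for illustration.
nobs  = 40;                                % number of observations (rows of A)
nterm = 4;                                 % number of candidate terms (columns of A)
dof   = nobs - nterm;                      % residual degrees of freedom
p     = [0.50, 0.70, 0.95];                % increasing two-sided confidence levels
optn.treg = tinv(1 - (1 - p)/2, dof);      % increasing ratios of coefficient to standard error
disp(optn.treg);
```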
Outputs are:
x is the result of the regression; with no rejection of terms it is A\b, and rejected terms are set to zero.
vm is the variance matrix of x, in particular sqrt(diag(vm)) gives the standard errors of x, and sqrt(a'*vm*a) is the standard error of a'*x. Entries corresponding to rejected terms are zero. If no terms are rejected vm is inv(A'*A)*info.sdr.
info.nreg is the number of eigenvectors used in the regression.
info.sdr is the standard deviation of the residuals.
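A short usage sketch based on the calling sequence and outputs described above. It requires linfitregsel.m on the MATLAB path; the data are synthetic, and the option values simply repeat the stated defaults. It is an assumption that fields left out of optn (here optn.sel) keep their defaults.

```matlab
% Usage sketch: requires linfitregsel.m on the MATLAB path.
% The data below are synthetic and purely illustrative.
rng(1);
n = 40;
u = rand(n,1);
v = rand(n,1);
A = [ones(n,1), u, v, u.*v];               % unit column gives a constant term
b = 3 + 2*u + 0.1*randn(n,1);              % v and u.*v do not affect b

optn.reg  = 0.01;                          % eigenvalue ratio threshold (stated default)
optn.treg = [0.5,1.0,2.0];                 % significance thresholds (stated default)

[x,vm,info] = linfitregsel(A,b,optn);

se   = sqrt(diag(vm));                     % standard errors of x
a    = [1; 0.5; 0.5; 0.25];                % one new predictor row, written as a column
pred = a'*x;                               % predicted value for that row
pse  = sqrt(a'*vm*a);                      % standard error of the prediction
fprintf('eigenvectors used: %d, residual sd: %g\n', info.nreg, info.sdr);
```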

The accuracy and usefulness of the regression equation depend on the experimental design, the range covered by the predictor matrix, and the accuracy of the data. The experimental design should cover the range of interest and the ratio of the extreme eigenvalues of A should not be small (possibly after normalisation of the data).
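The eigenvalue ratio can be checked before fitting. The sketch below assumes that "eigenvalues of A" refers to the eigenvalues of A'*A (the squared singular values of A), and uses scaling of each column to unit length as one possible normalisation; neither choice is confirmed by the description above.

```matlab
% Sketch: ratio of smallest to largest eigenvalue of A'*A, before and
% after scaling each column of A to unit length.
% Assumption: the eigenvalues meant are those of A'*A, computed here as
% squared singular values of A for better numerical accuracy.
rng(2);
n = 30;
t = 100 + 10*rand(n,1);                    % predictor far from the origin
A = [ones(n,1), t, t.^2];

s        = svd(A);
ratioRaw = (min(s)/max(s))^2;

An        = bsxfun(@rdivide, A, sqrt(sum(A.^2,1)));   % unit-length columns
sn        = svd(An);
ratioNorm = (min(sn)/max(sn))^2;

fprintf('ratio before: %g, after column scaling: %g\n', ratioRaw, ratioNorm);
```

For this example the ratio improves after scaling but remains small, because the columns are still strongly correlated; centring the predictor before forming powers would be another option to consider.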
Withholding part of the data for validation is strongly recommended.
As this regression method allows for correlated predictors by calculating an equation that balances the different predictors, it may give more robust predictions than ordinary linear regression using the same predictors.
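Withholding data for validation, as recommended above, might look like the following sketch. It requires linfitregsel.m on the MATLAB path; the split fraction, the synthetic data and the variable names are arbitrary choices made for illustration.

```matlab
% Hold-out validation sketch: requires linfitregsel.m on the MATLAB path.
rng(3);
n = 60;
u = rand(n,1);
A = [ones(n,1), u, u.^2];
b = 1 + 4*u + 0.1*randn(n,1);

idx   = randperm(n);
ntest = round(0.25*n);                     % withhold a quarter of the rows (arbitrary)
itest = idx(1:ntest);                      % rows kept back for validation
ifit  = idx(ntest+1:end);                  % rows used for fitting

[x,~,info] = linfitregsel(A(ifit,:),b(ifit));   % default options

res  = A(itest,:)*x - b(itest);            % prediction errors on withheld rows
rmse = sqrt(mean(res.^2));
fprintf('residual sd (fit): %g, rms error (withheld): %g\n', info.sdr, rmse);
```

A withheld rms error much larger than info.sdr would suggest that the selected equation does not generalise well over the range covered by the withheld rows.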

Cite As

Bill Whiten (2025). linfitregsel - term selection in linear regression (https://www.mathworks.com/matlabcentral/fileexchange/40566-linfitregsel-term-selection-in-linear-regression), MATLAB Central File Exchange. Retrieved 2025/4/3.

MATLAB Release Compatibility

Created with R2012b

Compatible with any release

Platform Compatibility

Windows macOS Linux

Categories

AI and Statistics > Statistics and Machine Learning Toolbox > Regression > Linear Regression


Acknowledgements

Inspired by: Optional function arguments, Restore project status for selected project

Inspired: Greyboxbuild: complete a greybox model
