Choose a Regression Function
Regression is the process of fitting models to data. The models must have numerical responses. For models with categorical responses, see Parametric Classification or Supervised Learning Workflow and Algorithms. The regression process depends on the model. If a model is parametric, regression estimates the parameters from the data. If a model is linear in the parameters, estimation is based on methods from linear algebra that minimize the norm of a residual vector. If a model is nonlinear in the parameters, estimation is based on search methods from optimization that minimize the norm of a residual vector.
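As a minimal sketch of the linear case, the following uses only base MATLAB to fit a straight line by least squares. The data, the variable names, and the perturbation are assumptions made for illustration; the point is that the backslash solution minimizes the norm of the residual vector.

```matlab
% Synthetic data (assumed for illustration): y = 2 + 3*x + noise
rng(0);                          % reproducible random numbers
x = linspace(0,1,50)';
y = 2 + 3*x + 0.1*randn(50,1);

A = [ones(50,1) x];              % design matrix with an intercept column
b = A\y;                         % least-squares solution: minimizes norm(y - A*b)

% Moving away from the solution cannot reduce the residual norm
resOpt  = norm(y - A*b);
resPert = norm(y - A*(b + [0.01; -0.01]));
```

Here b(1) estimates the intercept and b(2) the slope. A nonlinear model has no such closed-form solution, which is why functions such as fitnlm rely on iterative search.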
This table describes which function to use depending on the type of regression problem.
Model Components | Result of Regression | Function to Use |
---|---|---|
Continuous or categorical predictors, continuous response, linear model | Fitted model coefficients | fitlm. See Linear Regression. |
Continuous or categorical predictors, continuous response, linear model of unknown complexity | Fitted model and fitted coefficients | stepwiselm. See Stepwise Regression. |
Continuous or categorical predictors, response possibly with restrictions such as nonnegative or integer-valued, generalized linear model | Fitted generalized linear model coefficients | fitglm or stepwiseglm. See Generalized Linear Models. |
Continuous predictors with a continuous nonlinear response, parametrized nonlinear model | Fitted nonlinear model coefficients | fitnlm. See Nonlinear Regression. |
Continuous predictors, continuous response, linear model | Set of models from ridge, lasso, or elastic net regression | lasso or ridge. See Lasso and Elastic Net or Ridge Regression. |
Correlated continuous predictors, continuous response, linear model | Fitted model and fitted coefficients | plsregress. See Partial Least Squares. |
Continuous or categorical predictors, continuous response, unknown model | Nonparametric model | fitrtree or fitrensemble. |
Categorical predictors only | ANOVA | anova, anova1, anova2, anovan. |
Continuous predictors, multivariable response, linear model | Fitted multivariate regression model coefficients | mvregress. |
Continuous predictors, continuous response, mixed-effects model | Fitted mixed-effects model coefficients | nlmefit or nlmefitsa. See Mixed-Effects Models. |
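To illustrate what "set of models" means in the lasso row of the table, here is a small sketch on synthetic data. The data and variable names are assumptions for illustration only. lasso returns one coefficient vector per regularization strength instead of a single fitted model.

```matlab
% Synthetic data (assumed for illustration): only predictors 1 and 3 matter
rng(2);
X = randn(200,5);
y = 3*X(:,1) - 2*X(:,3) + 0.5*randn(200,1);

% lasso fits one coefficient vector per regularization strength
[B,FitInfo] = lasso(X,y);

nModels = size(B,2);             % one column of B per value in FitInfo.Lambda
lambdas = FitInfo.Lambda;        % regularization strengths that were tried
```

Choosing among the columns of B is a separate step, typically done with cross-validation.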
Update Legacy Code with New Fitting Methods
Statistics and Machine Learning Toolbox™ provides several functions for performing regression. The following sections describe how to replace calls to older functions with calls to their newer equivalents:
regress into fitlm
Previous Syntax:
[b,bint,r,rint,stats] = regress(y,X)
where X contains a column of ones.
Current Syntax:
mdl = fitlm(X,y)
where you do not add a column of ones to X.
Equivalent values of the previous outputs:
- b: mdl.Coefficients.Estimate
- bint: coefCI(mdl)
- r: mdl.Residuals.Raw
- rint: There is no exact equivalent. Try examining mdl.Residuals.Studentized to find outliers.
- stats: mdl contains various properties that replace components of stats.
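The mapping above can be checked on a small example. This is a sketch on synthetic data (the data and the names bNew, bintNew, and rNew are assumptions for illustration), showing the legacy and current calls side by side.

```matlab
% Synthetic data (assumed for illustration)
rng(1);
n = 100;
X = randn(n,2);
y = 1 + 2*X(:,1) - X(:,2) + 0.5*randn(n,1);

% Legacy call: the column of ones must be added by hand
[b,bint,r] = regress(y,[ones(n,1) X]);

% Current call: fitlm adds the intercept term itself
mdl = fitlm(X,y);

bNew    = mdl.Coefficients.Estimate;   % replaces b
bintNew = coefCI(mdl);                 % replaces bint (95% intervals by default)
rNew    = mdl.Residuals.Raw;           % replaces r
```

Both calls use a 95% confidence level by default, so the interval bounds should agree up to rounding.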
regstats into fitlm
Previous Syntax:
stats = regstats(y,X,model,whichstats)
Current Syntax:
mdl = fitlm(X,y,model)
Obtain statistics from the properties and methods of the LinearModel object (mdl). For example, see the mdl.Diagnostics and mdl.Residuals properties.
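As a sketch of where a few commonly requested regstats fields end up, the following compares both calls on synthetic data. The data and the chosen statistics are assumptions for illustration; other fields of stats map to other properties of mdl.

```matlab
% Synthetic data (assumed for illustration)
rng(3);
X = randn(60,2);
y = 5 + X(:,1) + 2*X(:,2) + randn(60,1);

% Legacy call: request specific statistics by name
stats = regstats(y,X,'linear',{'beta','rsquare','cookd'});

% Current call: read the same quantities from the model object
mdl = fitlm(X,y,'linear');

betaNew  = mdl.Coefficients.Estimate;        % replaces stats.beta
r2New    = mdl.Rsquared.Ordinary;            % replaces stats.rsquare
cookdNew = mdl.Diagnostics.CooksDistance;    % replaces stats.cookd
```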
robustfit into fitlm
Previous Syntax:
[b,stats] = robustfit(X,y,wfun,tune,const)
Current Syntax:
mdl = fitlm(X,y,'robust','on') % bisquare
Or, to use the wfun weight function and the tune tuning parameter:
opt.RobustWgtFun = wfun;
opt.Tune = tune; % optional
mdl = fitlm(X,y,'robust',opt)
Obtain statistics from the properties and methods of the LinearModel object (mdl). For example, see the mdl.Diagnostics and mdl.Residuals properties.
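The following sketch shows the options structure in use with a concrete weight function. The data, the choice of 'huber', and the tuning constant 1.345 are assumptions for illustration; the sketch also spells out the full argument name 'RobustOpts'.

```matlab
% Synthetic data with one gross outlier (assumed for illustration)
rng(4);
x = (1:40)';
y = 10 - 2*x + randn(40,1);
y(end) = y(end) + 100;

% Legacy call with an explicit weight function and tuning constant
[b,stats] = robustfit(x,y,'huber',1.345);

% Current call: pass the same settings in a structure
opt = struct('RobustWgtFun','huber','Tune',1.345);
mdl = fitlm(x,y,'RobustOpts',opt);

bNew = mdl.Coefficients.Estimate;   % replaces b
wNew = mdl.Robust.Weights;          % final robust weights, one per observation
```

Observations with small entries in wNew are the ones the robust fit downweighted.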
stepwisefit into stepwiselm
Previous Syntax:
[b,se,pval,inmodel,stats,nextstep,history] = stepwisefit(X,y,Name,Value)
Current Syntax:
mdl = stepwiselm(ds,modelspec,Name,Value)
or
mdl = stepwiselm(X,y,modelspec,Name,Value)
Obtain statistics from the properties and methods of the LinearModel object (mdl). For example, see the mdl.Diagnostics and mdl.Residuals properties.
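Here is a sketch of the two calls on synthetic data in which only the first and third predictors carry signal. The data, the starting model 'constant', and the upper bound 'linear' are assumptions chosen so that stepwiselm searches the same kind of terms as stepwisefit.

```matlab
% Synthetic data (assumed for illustration): predictors 1 and 3 matter
rng(5);
X = randn(150,4);
y = 1 + 4*X(:,1) - 3*X(:,3) + randn(150,1);

% Legacy call: inmodel flags the predictors that were selected
[b,se,pval,inmodel] = stepwisefit(X,y,'Display','off');

% Current call: start from a constant model and allow linear terms only
mdl = stepwiselm(X,y,'constant','Upper','linear','Verbose',0);

selected = mdl.CoefficientNames;   % names of the terms in the final model
```

Without the 'Upper' argument, stepwiselm also considers interaction terms, so the selected models can differ from the stepwisefit result.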
glmfit into fitglm
Previous Syntax:
[b,dev,stats] = glmfit(X,y,distr,param1,val1,...)
Current Syntax:
mdl = fitglm(X,y,'Distribution',distr,...)
Obtain statistics from the properties and methods of the GeneralizedLinearModel object (mdl). For example, the deviance is mdl.Deviance, and to compare mdl against a constant model, use devianceTest(mdl).
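As a sketch, the following fits a Poisson regression both ways on synthetic count data. The data and variable names are assumptions for illustration. Note that fitglm takes the distribution as a name-value argument.

```matlab
% Synthetic count data (assumed for illustration)
rng(6);
x = rand(200,1);
y = poissrnd(exp(0.5 + 1.5*x));

% Legacy call: distribution is the third positional argument
[b,dev] = glmfit(x,y,'poisson');

% Current call: distribution is a name-value argument
mdl = fitglm(x,y,'Distribution','poisson');

bNew   = mdl.Coefficients.Estimate;   % replaces b
devNew = mdl.Deviance;                % replaces dev
tbl    = devianceTest(mdl);           % comparison against a constant model
```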
nlinfit into fitnlm
Previous Syntax:
[beta,r,J,COVB,mse] = nlinfit(X,y,fun,beta0,options)
Current Syntax:
mdl = fitnlm(X,y,fun,beta0,'Options',options)
Equivalent values of the previous outputs:
- beta: mdl.Coefficients.Estimate
- r: mdl.Residuals.Raw
- COVB: mdl.CoefficientCovariance
- mse: mdl.MSE
mdl does not provide the Jacobian (J) output. The primary purpose of J was to pass it into nlparci or nlpredci to obtain confidence intervals for the estimated coefficients (parameters) or predictions. Obtain those confidence intervals as:
parci = coefCI(mdl)
[pred,predci] = predict(mdl)
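The following sketch runs both workflows on a synthetic exponential-decay model. The model function, the data, and the starting values are assumptions for illustration. Here predict is called with the predictor values passed explicitly.

```matlab
% Synthetic data from an exponential decay (assumed for illustration)
rng(7);
x = linspace(0,5,40)';
fun = @(b,x) b(1)*exp(-b(2)*x);
y = fun([3 0.8],x) + 0.05*randn(40,1);
beta0 = [1 1];

% Legacy workflow: the Jacobian feeds the confidence-interval function
[beta,r,J] = nlinfit(x,y,fun,beta0);
ciOld = nlparci(beta,r,'Jacobian',J);

% Current workflow: the model object computes the intervals itself
mdl = fitnlm(x,y,fun,beta0);
betaNew = mdl.Coefficients.Estimate;   % replaces beta
ciNew   = coefCI(mdl);                 % replaces nlparci
[pred,predci] = predict(mdl,x);        % predictions with confidence bounds
```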