fitlm
Description
mdl = fitlm(___,Name=Value) specifies options using one or more name-value arguments in addition to any of the input argument combinations in the previous syntaxes. For example, you can specify robust fitting options and observations to exclude from the fit.
Examples
Fit Linear Regression Model
Generate a D-optimal design and some response data for the design points.
dopt = optimalDOE(5,20);
pts = dopt.Design;
h = height(pts);
response = 2*pts.Factor1 + 3*pts.Factor2 + pts.Factor3 + 0.01*randn(h,1);
dopt is an optimalDOE object that contains information about the generated D-optimal design. response is a vector of response data.
Fit a linear model using the design points in dopt as the predictor data and response as the response data.
mdl = fitlm(dopt,response)
mdl =
Linear regression model:
    y ~ 1 + Factor1 + Factor2 + Factor3 + Factor4 + Factor5

Estimated Coefficients:
                    Estimate         SE         tStat        pValue
                   ___________    _________    ________    __________
    (Intercept)    -0.00045125     0.001709    -0.26404        0.7956
    Factor1             2.0012     0.001709        1171    2.4234e-36
    Factor2             2.9954     0.001709      1752.7    8.5516e-39
    Factor3            0.99767    0.0017443      571.97    5.5036e-32
    Factor4         -0.0020275    0.0017443     -1.1624       0.26452
    Factor5          0.0016909     0.001709      0.9894       0.33926

Number of observations: 20, Error degrees of freedom: 14
Root Mean Squared Error: 0.00764
R-squared: 1, Adjusted R-Squared: 1
F-statistic vs. constant model: 9.57e+05, p-value = 3.27e-38
mdl is a LinearModel object that contains the results of fitting a linear model to the data. The model display includes the model formula, estimated coefficients, and model summary statistics.
Specify Model for Linear Regression
Generate a mixture design and create some response data for the design points.
dmix = mixtureDOE(3);
pts = dmix.Design;
h = height(pts);
Y = 2*pts.Factor1 + pts.Factor2.*pts.Factor3 + 5 + 0.001*randn(h,1);
dmix is a mixtureDOE object that contains information about the generated mixture design. The vector Y contains response data.
Fit a linear model using the design points in dmix as the predictor data and Y as the response data. Specify the experiment model to fit.
mdl = fitlm(dmix,Y,"y~Factor1+Factor2:Factor3")
mdl =
Linear regression model:
    y ~ 1 + Factor1 + Factor2:Factor3

Estimated Coefficients:
                       Estimate        SE         tStat       pValue
                       ________    __________    ______    __________
    (Intercept)          4.9999    0.00096562    5177.9    8.3471e-15
    Factor1              2.0009     0.0017544    1140.5    3.5465e-12
    Factor2:Factor3     0.99501     0.0067547    147.31    1.2739e-08

Number of observations: 7, Error degrees of freedom: 4
Root Mean Squared Error: 0.00148
R-squared: 1, Adjusted R-Squared: 1
F-statistic vs. constant model: 7e+05, p-value = 8.16e-12
mdl is a LinearModel object that contains the results of fitting the experiment model to the data. The values in the pValue column suggest that each term in the model has a statistically significant effect on the response.
Fit Linear Regression Model Without Intercept
Generate a full factorial design and create some response data for the design points.
dff = fullFactorialDOE(3);
pts = dff.Design;
h = height(pts);
Y = 2*pts.Factor1 + 3*pts.Factor2 + pts.Factor3 + 0.01*randn(h,1);
dff is a fullFactorialDOE object that contains information about the generated full factorial design. The vector Y contains response data.
Fit a linear model using the design points in dff as the predictor data and Y as the response data.
mdl1 = fitlm(dff,Y)
mdl1 =
Linear regression model:
    y ~ 1 + Factor1 + Factor2 + Factor3

Estimated Coefficients:
                    Estimate         SE          tStat        pValue
                   ___________    _________    _________    __________
    (Intercept)    -0.00013126    0.0051434    -0.025521       0.98086
    Factor1             1.9974    0.0051434       388.34    2.6379e-10
    Factor2             2.9964    0.0051434       582.57     5.209e-11
    Factor3             1.0045    0.0051434       195.29    4.1244e-09

Number of observations: 8, Error degrees of freedom: 4
Root Mean Squared Error: 0.0145
R-squared: 1, Adjusted R-Squared: 1
F-statistic vs. constant model: 1.76e+05, p-value = 1.07e-10
mdl1 is a LinearModel object that contains the results of fitting a linear model to the data. The model display includes the model formula, estimated coefficients, and model summary statistics. The large p-value for the intercept term indicates that it does not have a statistically significant effect on the response.
Fit a linear model without an intercept term to the data.
mdl2 = fitlm(dff,Y,Intercept=0)
mdl2 =
Linear regression model:
    y ~ Factor1 + Factor2 + Factor3

Estimated Coefficients:
               Estimate       SE         tStat       pValue
               ________    _________    ______    __________
    Factor1      1.9974    0.0046008    434.15    1.2305e-12
    Factor2      2.9964    0.0046008    651.28    1.6198e-13
    Factor3      1.0045    0.0046008    218.32    3.8258e-11

Number of observations: 8, Error degrees of freedom: 5
Root Mean Squared Error: 0.013
mdl2 contains the results of fitting a linear model without an intercept term.
Inspect the loglikelihood of each model.
loglikelihoods = [mdl1.LogLikelihood,mdl2.LogLikelihood]
loglikelihoods = 1×2
25.2636 25.2629
The output shows that mdl1 has a slightly larger loglikelihood than mdl2. This result indicates that removing the intercept term does not have a significant effect on how well the model fits the data.
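One way to quantify how close the two loglikelihood values are is a likelihood ratio comparison of the nested models. The following is a minimal sketch that is not part of the original example. It assumes the usual large-sample chi-square approximation, which is only a rough guide with eight observations, and it hard-codes the 95% critical value for one degree of freedom rather than computing it.

```matlab
% Loglikelihood values copied from the displayed loglikelihoods output.
logL1 = 25.2636;   % mdl1: model with an intercept
logL2 = 25.2629;   % mdl2: model without an intercept

% Likelihood ratio statistic for the nested pair
% (mdl2 is mdl1 with one term removed).
LR = 2*(logL1 - logL2);

% Approximate 95% critical value of the chi-square distribution with
% 1 degree of freedom (hard-coded assumption for this sketch).
critValue = 3.8415;

% True only if the statistic exceeds the critical value.
interceptMatters = LR > critValue;
```

The statistic is about 0.0014, far below the critical value, which is consistent with the conclusion that the intercept term contributes little to the fit.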
Input Arguments
dobj - Design
fullFactorialDOE object | mixtureDOE object | optimalDOE object
Design, specified as a fullFactorialDOE, mixtureDOE, or optimalDOE object. fitlm fits the linear regression model using the design points in dobj.Design as predictors.
Y - Response variable
numeric vector
Response variable, specified as a p-by-1 numeric vector, where p is the number of design points in dobj. Each entry in Y is the response for the corresponding row of dobj.Design.
Data Types: single | double
modelspec - Experiment model
string scalar | character vector | terms matrix
Experiment model, specified as one of the following values.
A character vector or string scalar with the model name.
Value | Model Description |
---|---|
"linear" | The model contains an intercept and a linear term for each factor. |
"constant" | The model contains only a constant (intercept) term. |
"interactions" | The model contains an intercept, a linear term for each factor, and all products of pairs of distinct factors (no squared terms). |
"purequadratic" | The model contains an intercept term, and linear and squared terms for each factor. |
"quadratic" | The model contains an intercept term, linear and squared terms for each factor, and all products of pairs of distinct factors. |
"scheffe-linear" | The model contains a linear term for each factor and does not include an intercept term. |
"scheffe-quad" | The model is given by the formula: $y = \sum_{i} b_i x_i + \sum_{i<j} b_{ij} x_i x_j$ |
"scheffe-special-cubic" | The model is given by the formula: $y = \sum_{i} b_i x_i + \sum_{i<j} b_{ij} x_i x_j + \sum_{i<j<k} b_{ijk} x_i x_j x_k$ |
"polyijk" | The model is a polynomial with all terms up to degree i in the first factor, degree j in the second factor, and so on. Specify the maximum degree for each factor by using numerals 0 through 9. The model contains interaction terms, but the degree of each interaction term does not exceed the maximum value of the specified degrees. For example, "poly13" has an intercept and x1, x2, x2^2, x2^3, x1*x2, and x1*x2^2 terms, where x1 and x2 are the first and second factors, respectively. |

In the above table, each x_i corresponds to the ith factor in the design, and b_i, b_ij, b_ijk, and d_ij are coefficients for the model terms.
A character vector or string scalar formula in Wilkinson Notation. The factor names in the formula must be factor names specified by the FactorNames name-value argument when you create dobj.
A t-by-n terms matrix, where t is the number of terms and n is the number of factors in the design. A terms matrix is convenient when the number of factors is large and you want to generate the terms programmatically. For more information about terms matrices, see Terms Matrix.
The default value for modelspec is dobj.ModelSpecification.
Example: "quadratic"
Example: "x1 + x2^2 + x1:x2"
Data Types: single | double | char | string
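As an illustration of generating terms programmatically, the following sketch builds a terms matrix with an intercept, one linear term per factor, and all pairwise interaction terms, which matches the description of the "interactions" model. The number of factors n and the variable names are hypothetical, and the code uses only base MATLAB functions.

```matlab
n = 4;   % hypothetical number of factors in the design

% One row per linear term: a single 1 in the column of that factor.
linearTerms = eye(n);

% One row per pair of distinct factors: 1s in the two factor columns.
pairs = nchoosek(1:n,2);
interactionTerms = zeros(size(pairs,1),n);
for k = 1:size(pairs,1)
    interactionTerms(k,pairs(k,:)) = 1;
end

% A row of zeros is the constant (intercept) term.
T = [zeros(1,n); linearTerms; interactionTerms];
```

Assuming dobj is a design with n factors and Y is a matching response vector, T could then be supplied as the modelspec argument, for example fitlm(dobj,Y,T).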
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Example: fitlm(dobj,Y,Intercept=false,ResponseVar="OxygenLevel") fits a linear model without an intercept to the predictor data dobj and the response variable "OxygenLevel" in Y.
Exclude - Observations to exclude
numeric vector | logical vector
Observations to exclude from the fit, specified as a numeric or logical vector. The elements of the vector indicate which rows in dobj.Design to exclude from the fit.
Example: Exclude=[2,3]
Example: Exclude=logical([0 1 1 0 0 0])
Data Types: single | double | logical
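The two example values describe the same exclusion for a design with six points. The following sketch, which uses a hypothetical number of design points p and only base MATLAB functions, shows how the numeric index form and the logical mask form correspond.

```matlab
p = 6;                % hypothetical number of design points

idx = [2,3];          % numeric form: indices of the rows to exclude

mask = false(p,1);    % logical form: one element per design point
mask(idx) = true;     % mark rows 2 and 3 for exclusion

% Both forms select the same rows.
sameRows = isequal(find(mask)',idx);
```

Either idx or mask could then be passed as the value of Exclude.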
Intercept - Indicator for constant term
true or 1 | false or 0
Indicator for the constant term (intercept) in the fit, specified as a numeric or logical 1 (true) to include the constant term, or 0 (false) to remove the constant term from the model. The default value for Intercept is true if dobj.ModelSpecification contains an intercept. Otherwise, the default value is false.
You can specify Intercept only when modelspec is a model name.
Example: Intercept=false
Data Types: logical
ResponseVar - Response variable name
string | character vector
Response variable name, specified as a string or a character vector. The default value for ResponseVar is "y".
Example: ResponseVar="yield"
Data Types: char | string
RobustOpts - Type of robust fitting
"off" (default) | "on" | character vector | string scalar | structure
Type of robust fitting to use, specified as one of these values:
- "off": No robust fitting. fitlm uses ordinary least squares.
- "on": Robust fitting using the "bisquare" weight function with the default tuning constant.
- Character vector or string scalar: Name of a robust fitting weight function from the following table. fitlm uses the corresponding default tuning constant.
- Structure with the two fields RobustWgtFun and Tune. The RobustWgtFun field contains the name of a robust fitting weight function from the following table, or the function handle of a custom weight function. The Tune field contains a tuning constant. If you do not set the Tune field, fitlm uses the corresponding default tuning constant.
Weight Function | Description | Default Tuning Constant |
---|---|---|
"andrews" | w = (abs(r)<pi) .* sin(r) ./ r | 1.339 |
"bisquare" | w = (abs(r)<1) .* (1 - r.^2).^2 (also called biweight) | 4.685 |
"cauchy" | w = 1 ./ (1 + r.^2) | 2.385 |
"fair" | w = 1 ./ (1 + abs(r)) | 1.400 |
"huber" | w = 1 ./ max(1, abs(r)) | 1.345 |
"logistic" | w = tanh(r) ./ r | 1.205 |
"ols" | Ordinary least squares (no weight function) | None |
"talwar" | w = 1 * (abs(r)<1) | 2.795 |
"welsch" | w = exp(-(r.^2)) | 2.985 |
function handle | Custom weight function that accepts a vector r of scaled residuals, and returns a vector of weights the same size as r | 1 |
The default tuning constants of built-in weight functions give coefficient estimates that are approximately 95% as statistically efficient as the ordinary least-squares estimates, provided that the response has a normal distribution with no outliers. Decreasing the tuning constant increases the downweight assigned to large residuals. Increasing the tuning constant decreases the downweight assigned to large residuals.
The value r in the weight functions is determined by
r = resid/(tune*s*sqrt(1-h)),
where resid is the vector of residuals from the previous iteration, tune is the tuning constant, and h is the vector of leverage values from a least-squares fit. s is an estimate of the standard deviation of the error term given by
s = MAD/0.6745.
MAD is the median absolute deviation of the residuals from their median. The constant 0.6745 makes the estimate unbiased for the normal distribution. If the predictor data X has p columns, the software excludes the smallest p absolute deviations when computing the median.
For robust fitting, fitlm uses M-estimation to formulate estimating equations, and solves them using the method of Iteratively Reweighted Least Squares (IRLS).
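The following sketch works through a single weight computation as described by the formulas for r and s. It is an illustration of the description only, not the internal code of fitlm. The residuals, leverage values, and column count are hypothetical values chosen for the example, and the weight function is the "bisquare" entry from the table.

```matlab
% Hypothetical inputs; none of these values come from a real fit.
resid = [0.10; -0.20; 0.05; 3.00; -0.15; 0.12];  % residuals from the previous iteration
h     = [0.30; 0.25; 0.35; 0.40; 0.30; 0.40];    % leverage values from a least-squares fit
p     = 2;                                       % number of columns in X
tune  = 4.685;                                   % default tuning constant for "bisquare"

% Absolute deviations of the residuals from their median, in ascending order.
absdev = sort(abs(resid - median(resid)));

% Exclude the p smallest absolute deviations, then scale the median by 0.6745.
s = median(absdev(p+1:end))/0.6745;

% Scaled residuals.
r = resid./(tune*s*sqrt(1-h));

% Bisquare weights: zero when abs(r) >= 1, close to 1 when r is near zero.
w = (abs(r)<1).*(1 - r.^2).^2;
```

In this example the fourth observation has a scaled residual greater than 1 in absolute value, so it receives a weight of 0, while the remaining observations keep weights close to 1. An IRLS iteration would refit the model with these weights and repeat the computation with the new residuals.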
Example: RobustOpts="andrews"
Data Types: char | string | struct
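For the structure form, a minimal sketch might look like the following. The custom weight function and the tuning constant are illustrative choices (the function has the same form as the "fair" entry in the table), and the commented fitlm call assumes that a design dobj and a response vector Y already exist.

```matlab
% Custom weight function: accepts a vector of scaled residuals r and
% returns a vector of weights the same size as r.
customWgt = @(r) 1./(1 + abs(r));

% Structure with the two fields described for RobustOpts.
opts = struct("RobustWgtFun",customWgt,"Tune",1.4);

% Evaluate the weight function on a few sample scaled residuals.
w = opts.RobustWgtFun([0; 1; -3]);

% Assumed usage, with dobj and Y defined elsewhere:
% mdl = fitlm(dobj,Y,RobustOpts=opts);
```

Larger scaled residuals receive smaller weights, which is the behavior a robust weight function needs in order to reduce the influence of outliers.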
Weights - Observation weights
ones(p,1) (default) | p-by-1 vector of nonnegative scalar values
Observation weights, specified as a p-by-1 vector of nonnegative scalar values, where p is the number of design points.
Data Types: single | double
Output Arguments
mdl - Fitted model
LinearModel object
Fitted model, returned as a LinearModel object.
If you do not set the RobustOpts name-value argument, or specify it as "ols", the model is a least-squares fit. Otherwise, fitlm fits the model using the robust fitting function specified by RobustOpts.
More About
Terms Matrix
A terms matrix T is a t-by-n matrix specifying the terms in a model, where t is the number of terms, and n is the number of factors in the design. The value of T(i,j) is the exponent of variable j in term i.
For example, suppose that a design includes three factors x1, x2, and x3. Each row of T represents one term:
- [0 0 0]: Constant term or intercept
- [0 1 0]: x2; equivalently, x1^0 * x2^1 * x3^0
- [1 0 1]: x1*x3
- [2 0 0]: x1^2
- [0 1 2]: x2*(x3^2)
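To make the exponent interpretation concrete, the following sketch evaluates each row of the example terms matrix at two design points. The design points in X are hypothetical, and the code uses only base MATLAB functions.

```matlab
% Terms matrix from the example: one row per term, one column per factor.
T = [0 0 0    % constant term
     0 1 0    % x2
     1 0 1    % x1*x3
     2 0 0    % x1^2
     0 1 2];  % x2*(x3^2)

% Two hypothetical design points, with columns x1, x2, and x3.
X = [1 2 3
     2 0 1];

% Term i at each design point is the product over factors of x_j^T(i,j).
D = zeros(size(X,1),size(T,1));
for i = 1:size(T,1)
    D(:,i) = prod(X.^T(i,:),2);
end
```

Each column of D holds the values of one model term, so D plays the role of a design matrix for the model that T describes. For the first design point, the five terms evaluate to 1, 2, 3, 1, and 18.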
Wilkinson Notation
Wilkinson notation describes the terms in a model. The notation relates to the terms included in the model, not to the multipliers (coefficients) of those terms.
Wilkinson notation uses these symbols:
- + means include the next variable.
- - means do not include the next variable.
- : defines an interaction, which is a product of the terms.
- * defines an interaction and all lower order terms.
- ^ raises the predictor to a power, exactly as in * repeated, so ^ includes lower order terms as well.
- () groups the terms.
This table shows typical examples of Wilkinson notation.
Wilkinson Notation | Terms in Standard Notation |
---|---|
1 | Constant (intercept) term |
x1^k, where k is a positive integer | x1, x1^2, ..., x1^k |
x1 + x2 | x1, x2 |
x1*x2 | x1, x2, x1*x2 |
x1:x2 | x1*x2 only |
-x2 | Do not include x2 |
x1*x2 + x3 | x1, x2, x3, x1*x2 |
x1 + x2 + x3 + x1:x2 | x1, x2, x3, x1*x2 |
x1*x2*x3 - x1:x2:x3 | x1, x2, x3, x1*x2, x1*x3, x2*x3 |
x1*(x2 + x3) | x1, x2, x3, x1*x2, x1*x3 |
For more details, see Wilkinson Notation.
Version History
Introduced in R2024b