Linear Mixed-Effects Models
Linear mixed-effects models are extensions of linear regression models for data that are collected and summarized in groups. These models describe the relationship between a response variable and independent variables, with coefficients that can vary with respect to one or more grouping variables. A mixed-effects model consists of two parts: fixed effects and random effects. Fixed-effects terms are usually the conventional linear regression part, and the random effects are associated with individual experimental units drawn at random from a population. The random effects have prior distributions, whereas the fixed effects do not. Mixed-effects models can represent the covariance structure related to the grouping of data by associating common random effects with observations that have the same level of a grouping variable. The standard form of a linear mixed-effects model is

y = Xβ + Zb + ε,
where
y is the n-by-1 response vector, and n is the number of observations.
X is an n-by-p fixed-effects design matrix.
β is a p-by-1 fixed-effects vector.
Z is an n-by-q random-effects design matrix.
b is a q-by-1 random-effects vector.
ε is the n-by-1 observation error vector.
The assumptions for the linear mixed-effects model are:
Random-effects vector, b, and the error vector, ε, have the following prior distributions:

b ~ N(0, σ²D(θ)),
ε ~ N(0, σ²I),

where D is a symmetric and positive semidefinite matrix, parameterized by a variance component vector θ, I is an n-by-n identity matrix, and σ² is the error variance.
Random-effects vector, b, and the error vector, ε, are independent of each other.
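For example, such a model can be fit directly from its design matrices with fitlmematrix. The following is a minimal sketch on simulated data used purely for illustration; the variable names (g, nPerGroup, and so on) are not part of the model definition above.

% Simulated data: M groups, a random intercept for each group
rng(1)                                   % for reproducibility
M = 5; nPerGroup = 30; n = M*nPerGroup;
g  = repelem((1:M)', nPerGroup);         % grouping variable with levels 1..M
x  = randn(n,1);                         % continuous predictor
b0 = 0.8*randn(M,1);                     % one random intercept per group
y  = 2 + 1.5*x + b0(g) + 0.3*randn(n,1); % response

X = [ones(n,1) x];                       % n-by-p fixed-effects design matrix (p = 2)
Z = ones(n,1);                           % n-by-q random-effects design matrix (q = 1)
lme = fitlmematrix(X, y, Z, g);          % fits y = Xβ + Zb + ε, with b grouped by g
disp(lme)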
Mixed-effects models are also called multilevel models or hierarchical models, depending on the context. Mixed-effects model is the more general of these terms: mixed-effects models might include factors that are not necessarily multilevel or hierarchical, for example crossed factors. That is why mixed-effects is the terminology preferred here. Sometimes mixed-effects models are expressed as multilevel regression models (first-level and group-level models) that are fit simultaneously. For example, a varying or random intercept model, with one continuous predictor variable x and one grouping variable with M levels, can be expressed as

yim = β0m + β1xim + εim,   i = 1, 2, ..., n,   m = 1, 2, ..., M,
β0m = β00 + b0m,   b0m ~ N(0, σ0²),   εim ~ N(0, σ²),
where yim corresponds to data for observation i and group m, n is the total number of observations, and b0m and εim are independent of each other. After substituting the group-level parameters in the first-level model, the model for the response vector becomes

yim = β00 + β1xim + b0m + εim.
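In Wilkinson formula notation, this random-intercept model can be fit with fitlme. The sketch below reuses the simulated variables from the earlier example; the table and variable names are illustrative only.

tbl = table(y, x, categorical(g), 'VariableNames', {'y','x','g'});  % illustrative data table
lme = fitlme(tbl, 'y ~ x + (1|g)');   % fixed intercept and slope, random intercept per level of g
fixedEffects(lme)                     % estimates of β00 and β1
randomEffects(lme)                    % estimated b0m, one per group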
A random intercept and slope model with one continuous predictor variable x, where both the intercept and the slope vary independently by a grouping variable with M levels, is

yim = β0m + β1mxim + εim,
β0m = β00 + b0m,   b0m ~ N(0, σ0²),
β1m = β10 + b1m,   b1m ~ N(0, σ1²),
εim ~ N(0, σ²),
or, combining the two levels into a single equation,

yim = (β00 + b0m) + (β10 + b1m)xim + εim.
You might also have correlated random effects. In general, for a model with a random intercept and slope, the distribution of the random effects is

[b0m; b1m] ~ N(0, σ²D(θ)),
where D is a 2-by-2 symmetric and positive semidefinite matrix, parameterized by a variance component vector θ.
After substituting the group-level parameters in the first-level model, the model for the response vector is

yim = β00 + β10xim + b0m + b1mxim + εim.
If you express the variable xim in the random-effects term by zim, this model is

yim = β00 + β10xim + b0m + b1mzim + εim,   where zim = xim.

In this case, the same terms appear in both the fixed-effects design matrix and the random-effects design matrix. Each zim and xim corresponds to the level m of the grouping variable.
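In formula notation, the independent and correlated versions of this random intercept and slope model differ only in the random-effects terms. A minimal sketch, reusing the illustrative table tbl from above:

% Independent random intercept and slope for each level of g (diagonal D)
lmeIndep = fitlme(tbl, 'y ~ x + (1|g) + (-1 + x|g)');
% Correlated random intercept and slope (full 2-by-2 D(θ), as in the model above)
lmeCorr  = fitlme(tbl, 'y ~ x + (x|g)');
% Estimated random-effects covariance parameters and error variance
[psi, mse] = covarianceParameters(lmeCorr);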
It is also possible to explain more of the group-level variation by adding more group-level predictor variables. A random-intercept and random-slope model with one continuous predictor variable x, where both the intercept and slope vary independently by a grouping variable with M levels, and one group-level predictor variable vm, is

yim = β0m + β1mxim + εim,
β0m = β00 + β01vm + b0m,
β1m = β10 + β11vm + b1m.
This model results in main effects of the group-level predictor and an interaction term between the first-level and group-level predictor variables in the model for the response variable as

yim = β00 + β01vm + b0m + (β10 + β11vm + b1m)xim + εim
    = β00 + β10xim + β01vm + β11vmxim + b0m + b1mxim + εim.
The term β11vmxim is often called a cross-level interaction in many textbooks on multilevel models. The model for the response variable y can be expressed as

yim = [1  xim  vm  vmxim][β00; β10; β01; β11] + [1  xim][b0m; b1m] + εim,   i = 1, 2, ..., n,   m = 1, 2, ..., M,
which corresponds to the standard form given earlier,

y = Xβ + Zb + ε.
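A sketch of this cross-level model in formula notation, assuming a hypothetical group-level predictor v added to the illustrative table from the earlier examples:

vGroup = randn(M,1);                  % hypothetical group-level predictor, one value per group
tbl.v  = vGroup(g);                   % replicate it to the observation level
% x*v expands to x + v + x:v, so β10, β01, and the cross-level term β11 are all estimated;
% (x|g) requests a correlated random intercept and slope for each level of g
lme = fitlme(tbl, 'y ~ x*v + (x|g)');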
In general, if there are R grouping variables, and m(r,i) denotes the level of grouping variable r for observation i, then the model for the response variable for observation i is

yi = xiβ + Σr=1..R zi(r)b(r)m(r,i) + εi,
where β is a p-by-1 fixed-effects vector, b(r)m(r,i) is a q(r)-by-1 random-effects vector for the rth grouping variable and level m(r,i), and εi is a 1-by-1 error term for observation i. Here, xi is the ith row of the fixed-effects design matrix X, and zi(r) is the ith row of the random-effects design matrix associated with the rth grouping variable.
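When there is more than one grouping variable, each grouping variable contributes its own random-effects term in the formula. A sketch, assuming a hypothetical table tbl2 with two grouping variables g1 and g2 (crossed or nested):

% Random intercept per level of g1, correlated random intercept and slope per level of g2 (R = 2)
lme = fitlme(tbl2, 'y ~ x + (1|g1) + (x|g2)');
% A grouping interaction such as (1|g1:g2) adds a random intercept for each combination of levels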
See Also
LinearMixedModel | fitlme | fitlmematrix