What Is Panel Data?
Panel data is a set of observations on multiple subjects collected over time. Examples of panel data include data collected on individuals, households, firms, municipalities, states, or countries over the same time period.
There are two types of panel data:
- Balanced (complete) panel data comprises all observations for each individual measured at the same time points. Example: Economic data from countries or states collected yearly for 10 years.
- Unbalanced (incomplete) panel data comprises missing observations for some individuals for certain time points. Example: Financial data from firms or individuals where some firms or individuals are older than others.
Most statistical analyses are performed on so-called cross-sectional data, which is collected at one point in time. By contrast, panel data analysis extends statistical analyses of cross-sectional data over multiple time points by fitting panel regression models that account for both cross-section effects and time effects. These methods give more reliable parameter estimates compared to linear regression models.
Popular methods for panel data analysis include multivariate regression and linear mixed-effects models. Panel regression models differ in how they account for cross-sectional and time effects.
- Panel data fixed-effect models or least squares with dummy variables (LSDV) models: Cross-sectional effects are modeled using dummy variables.
- One-way random-effects models: Cross-sectional effects, but not time effects, are modeled as random effects.
- Two-way random-effects models: Both cross-section effects and time effects are modeled as random effects.
- Nested (hierarchical) models: Nested groupings in cross-section data (for example, states nested in countries) are modeled as random effects.
MATLAB® supports common estimation methods for panel data regression models, including:
Key Points
- Longitudinal data is common in econometrics, biostatistics (such as drug development), and sociology.
- Popular methods for panel data analysis include multivariate regression and linear mixed-effects models.
For more information on how to fit various panel data regression models, see Statistics and Machine Learning Toolbox™, Financial Toolbox™, and Econometrics Toolbox™ for use with MATLAB.
Examples and How To
Software Reference
See also: Statistics and Machine Learning Toolbox, Econometrics Toolbox, Financial Toolbox, linear model, linear regression, Predictive Modeling
Data Science
Use MATLAB for Big Data, Machine Learning and Production Analytics Systems.