Specify ARIMA Error Model Innovation Distribution

About the Innovation Process

A regression model with ARIMA errors has the following general form:

$\begin{array}{rcl} y_{t} & = & c + X_{t} β + u_{t} \\ a (L) A (L) {(1 - L)}^{D} (1 - L^{s}) u_{t} & = & b (L) B (L) ε_{t}, \end{array}$

(1)

where

t = 1,...,T.
y_t is the response series.
X_t is row t of X, which is the matrix of concatenated predictor data vectors. That is, X_t is observation t of each predictor series.
c is the regression model intercept.
β is the regression coefficient.
u_t is the disturbance series.
ε_t is the innovations series.
$L^{j} y_{t} = y_{t - j} .$
$a (L) = (1 - a_{1} L - ... - a_{p} L^{p}),$ which is the degree p, nonseasonal autoregressive polynomial.
$A (L) = (1 - A_{1} L - ... - A_{p_{s}} L^{p_{s}}),$ which is the degree p_s, seasonal autoregressive polynomial.
${(1 - L)}^{D},$ which is the degree D, nonseasonal integration polynomial.
$(1 - L^{s}),$ which is the degree s, seasonal integration polynomial.
$b (L) = (1 + b_{1} L + ... + b_{q} L^{q}),$ which is the degree q, nonseasonal moving average polynomial.
$B (L) = (1 + B_{1} L + ... + B_{q_{s}} L^{q_{s}}),$ which is the degree q_s, seasonal moving average polynomial.

Suppose that the unconditional disturbance series (u_t) is a stationary stochastic process. Then, you can express the second equation in Equation 1 as

$u_{t} = a^{- 1} (L) A^{- 1} (L) {(1 - L)}^{- D} {(1 - L^{s})}^{- 1} b (L) B (L) ε_{t} = Ψ (L) ε_{t},$

where Ψ(L) is an infinite degree lag operator polynomial [2].
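
As a hedged illustration (this special case is an assumption chosen for clarity, not part of the general statement above), take AR(1) errors with no integration, no seasonal terms, and no moving average terms, so that a(L) = 1 - a_1 L and every other polynomial equals 1. Assuming |a_1| < 1, inverting a(L) as a geometric series gives

$u_{t} = {(1 - a_{1} L)}^{- 1} ε_{t} = \sum_{j = 0}^{\infty} a_{1}^{j} L^{j} ε_{t} = \sum_{j = 0}^{\infty} a_{1}^{j} ε_{t - j},$

so in this case Ψ(L) = 1 + a_1 L + a_1² L² + ..., and the coefficient on ε_{t-j} is a_1^j.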

The innovation process (ε_t) is an independent and identically distributed (iid), mean 0 process with a known distribution. Econometrics Toolbox™ generalizes the innovation process to ε_t = σz_t, where z_t is a series of iid random variables with mean 0 and variance 1, and σ² is the constant variance of ε_t.

regARIMA models contain two properties that describe the distribution of ε_t:

Variance stores σ².
Distribution stores the parametric form of z_t.
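
The scaling ε_t = σz_t can be checked numerically. The following is a minimal sketch in base MATLAB, assuming a Gaussian z_t; the values of sigma2 and numObs are hypothetical and chosen only for illustration.

```matlab
rng(0);                      % reproducible draws
sigma2 = 4;                  % hypothetical innovation variance (sigma^2)
numObs = 1e5;                % hypothetical sample size

z = randn(numObs,1);         % iid standard Gaussian z_t (mean 0, variance 1)
e = sqrt(sigma2)*z;          % innovations e_t = sigma*z_t

sampleMean = mean(e);        % should be close to 0
sampleVar  = var(e);         % should be close to sigma2
```

The sample variance of e approaches sigma2 as numObs grows, which is the quantity that the Variance property stores.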

Innovation Distribution Options

The default value of Variance is NaN, meaning that the innovation variance is unknown. You can assign a positive scalar to Variance when you specify the model using the name-value pair argument 'Variance',sigma2 (where sigma2 = σ²), or by modifying an existing model using dot notation. Alternatively, you can estimate Variance using estimate.
You can specify the following distributions for z_t (using name-value pair arguments or dot notation):
- Standard Gaussian
- Standardized Student’s t with degrees of freedom ν > 2. Specifically,
  
  $z_{t} = T_{ν} \sqrt{\frac{ν - 2}{ν}},$
  where T_ν is a Student’s t distribution with degrees of freedom ν > 2.
The t distribution is useful for modeling innovations that are more extreme than expected under a Gaussian distribution. Such innovation processes have excess kurtosis, that is, a distribution that is more peaked (or heavier tailed) than a Gaussian. Note that for ν > 4, the kurtosis (the fourth central moment divided by the squared variance) of T_ν is the same as the kurtosis of the standardized Student’s t (z_t). That is, the kurtosis of a t random variable is scale invariant.
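
As a short worked check of these two properties (a sketch using standard moments of the Student’s t distribution), recall that Var(T_ν) = ν/(ν - 2) for ν > 2. The scaling therefore yields unit variance:

$Var (z_{t}) = \frac{ν - 2}{ν} Var (T_{ν}) = \frac{ν - 2}{ν} \cdot \frac{ν}{ν - 2} = 1.$

For any constant c ≠ 0 and any mean 0 random variable X with a finite fourth moment, the factor c⁴ cancels in the kurtosis ratio:

$\frac{E [{(c X)}^{4}]}{{(E [{(c X)}^{2}])}^{2}} = \frac{c^{4} E [X^{4}]}{c^{4} {(E [X^{2}])}^{2}} = \frac{E [X^{4}]}{{(E [X^{2}])}^{2}}.$

Consequently, z_t and T_ν share the kurtosis 3 + 6/(ν - 4) for ν > 4. For example, ν = 10 gives a kurtosis of 4, compared with 3 for a Gaussian.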
Tip
It is good practice to assess the distributional properties of the residuals to determine if a Gaussian innovation distribution (the default distribution) is appropriate for your model.
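
One way to follow this tip is to fit the model with Gaussian innovations and then inspect the residuals. The sketch below is illustrative only: the data are simulated from a hypothetical model with t innovations (TrueMdl, X, and all coefficient values are assumptions), and kurtosis requires Statistics and Machine Learning Toolbox™.

```matlab
rng(1);                                   % reproducible draws
T = 500;                                  % hypothetical sample size
X = randn(T,1);                           % hypothetical predictor series

% Hypothetical data-generating model, fully specified so that simulate can run
TrueMdl = regARIMA('AR',{0.5,-0.3},'Intercept',1,'Beta',2, ...
    'Variance',1,'Distribution',struct('Name','t','DoF',5));
y = simulate(TrueMdl,T,'X',X);

% Fit a regression model with AR(2) errors and default Gaussian innovations
Mdl = regARIMA(2,0,0);
EstMdl = estimate(Mdl,y,'X',X,'Display','off');

% Infer the residuals and compute their sample kurtosis
E = infer(EstMdl,y,'X',X);
k = kurtosis(E);                          % a Gaussian has kurtosis 3
```

A sample kurtosis well above 3 suggests that a t innovation distribution might describe the data better than the Gaussian default. A normal probability plot of E is another common diagnostic.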

Specify Innovation Distribution

regARIMA stores the distribution (and degrees of freedom for the t distribution) in the Distribution property. The data type of Distribution is a struct array with potentially two fields: Name and DoF.

If the innovations are Gaussian, then the Name field is Gaussian, and there is no DoF field. regARIMA sets Distribution to Gaussian by default.
If the innovations are t-distributed, then the Name field is t and the DoF field is NaN by default, or you can specify a scalar that is greater than 2.

To illustrate specifying the distribution, consider this regression model with AR(2) errors:

$\begin{array}{rcl} y_{t} & = & c + X_{t} β + u_{t} \\ u_{t} & = & a_{1} u_{t - 1} + a_{2} u_{t - 2} + ε_{t} \end{array}$

Mdl = regARIMA(2,0,0);
Mdl.Distribution

ans = struct with fields:
    Name: "Gaussian"

By default, the Distribution property of Mdl is a struct array with the field Name having the value Gaussian.

If you want to specify a t innovation distribution, then you can either specify the model using the name-value pair argument 'Distribution','t', or use dot notation to modify an existing model.

Specify the model using the name-value pair argument.

Mdl = regARIMA('ARLags',1:2,'Distribution','t');
Mdl.Distribution

ans = struct with fields:
    Name: "t"
     DoF: NaN

If you use the name-value pair argument to specify the t innovation distribution, then the default degrees of freedom is NaN.

You can use dot notation to yield the same result.

Mdl = regARIMA(2,0,0);
Mdl.Distribution = 't'

Mdl = 
  regARIMA with properties:

     Description: "ARMA(2,0) Error Model (t Distribution)"
      SeriesName: "Y"
    Distribution: Name = "t", DoF = NaN
       Intercept: NaN
            Beta: [1×0]
               P: 2
               Q: 0
              AR: {NaN NaN} at lags [1 2]
             SAR: {}
              MA: {}
             SMA: {}
        Variance: NaN

If the innovation distribution is $t_{10}$, then you can use dot notation to assign a new struct to the Distribution property of the existing model Mdl. You cannot modify the individual fields of Distribution using dot notation, e.g., Mdl.Distribution.DoF = 10 is not a valid assignment. However, you can display the value of the fields using dot notation.

Mdl.Distribution = struct('Name','t','DoF',10)

Mdl = 
  regARIMA with properties:

     Description: "ARMA(2,0) Error Model (t Distribution)"
      SeriesName: "Y"
    Distribution: Name = "t", DoF = 10
       Intercept: NaN
            Beta: [1×0]
               P: 2
               Q: 0
              AR: {NaN NaN} at lags [1 2]
             SAR: {}
              MA: {}
             SMA: {}
        Variance: NaN

tDistributionDoF = Mdl.Distribution.DoF

tDistributionDoF = 
10

Because the DoF field is not NaN, estimate treats it as an equality constraint when you estimate Mdl.
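
The following sketch contrasts the two cases. It assumes simulated data from a hypothetical model (TrueMdl, X, and all coefficient values are illustrative assumptions, not part of the example above).

```matlab
rng(1);                                   % reproducible draws
T = 500;                                  % hypothetical sample size
X = randn(T,1);                           % hypothetical predictor series

% Hypothetical data-generating model, fully specified so that simulate can run
TrueMdl = regARIMA('AR',{0.5,-0.3},'Intercept',1,'Beta',2, ...
    'Variance',1,'Distribution',struct('Name','t','DoF',10));
y = simulate(TrueMdl,T,'X',X);

% DoF = NaN: the degrees of freedom are estimated with the other parameters
MdlFree = regARIMA('ARLags',1:2,'Distribution','t');
EstFree = estimate(MdlFree,y,'X',X,'Display','off');

% DoF = 10: the degrees of freedom are held at 10 during estimation
MdlFixed = regARIMA('ARLags',1:2, ...
    'Distribution',struct('Name','t','DoF',10));
EstFixed = estimate(MdlFixed,y,'X',X,'Display','off');

dofFree  = EstFree.Distribution.DoF;      % estimated from the data
dofFixed = EstFixed.Distribution.DoF;     % unchanged by estimation
```

Fixing DoF is reasonable when you have prior knowledge of the tail behavior. Otherwise, leaving it as NaN lets the data inform the estimate.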

Alternatively, you can specify the $t_{10}$ innovation distribution using the name-value pair argument.

Mdl = regARIMA('ARLags',1:2,'Intercept',0,...
    'Distribution',struct('Name','t','DoF',10))

Mdl = 
  regARIMA with properties:

     Description: "ARMA(2,0) Error Model (t Distribution)"
      SeriesName: "Y"
    Distribution: Name = "t", DoF = 10
       Intercept: 0
            Beta: [1×0]
               P: 2
               Q: 0
              AR: {NaN NaN} at lags [1 2]
             SAR: {}
              MA: {}
             SMA: {}
        Variance: NaN

References

[1] Box, G. E. P., G. M. Jenkins, and G. C. Reinsel. Time Series Analysis: Forecasting and Control. 3rd ed. Englewood Cliffs, NJ: Prentice Hall, 1994.

[2] Wold, H. A Study in the Analysis of Stationary Time Series. Uppsala, Sweden: Almqvist & Wiksell, 1938.

More About

Regression Models with Time Series Errors