kpsstest

KPSS test for stationarity

Description

h = kpsstest(y) returns the rejection decision from conducting the Kwiatkowski, Phillips, Schmidt, and Shin (KPSS) test for a unit root in the input univariate time series y.

[h,pValue,stat,cValue] = kpsstest(y) also returns the p-value pValue, test statistic stat, and critical value cValue of the test.

StatTbl = kpsstest(Tbl) returns a table containing variables for the test results, statistics, and settings from conducting the KPSS test for a unit root in the last variable of the input table or timetable Tbl. To select a different variable in Tbl to test, use the DataVariable name-value argument.

[___] = kpsstest(___,Name=Value) specifies options using one or more name-value arguments in addition to any of the input argument combinations in previous syntaxes. kpsstest returns the output argument combination for the corresponding input arguments.

Some options control the number of tests to conduct. The following conditions apply when kpsstest conducts multiple tests:

  • kpsstest treats each test as separate from all other tests.

  • If you specify y, all outputs are vectors.

  • If you specify Tbl, each row of StatTbl contains the results of the corresponding test.

For example, kpsstest(Tbl,DataVariable="GDP",Alpha=0.025,Lags=[0 1]) conducts two tests, at a level of significance of 0.025, for the presence of a unit root in the variable GDP of the table Tbl. The first test includes 0 autocovariance lags in the Newey-West estimator of the long-run variance and the second test includes 1 autocovariance lag.
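
As a minimal sketch of the multiple-test behavior with a numeric vector input, the following code runs three tests on a simulated random walk. The simulated series and the seed are illustrative assumptions rather than shipped example data, and the call requires Econometrics Toolbox.

```matlab
rng(1)                               % fix the seed so the sketch is repeatable
y = cumsum(randn(200,1));            % hypothetical data: simulated random walk

% One test per element of Lags; each output has one element per test.
[h,pValue,stat,cValue] = kpsstest(y,Lags=[0 1 2]);
numTests = numel(h)
```

Each of h, pValue, stat, and cValue has three elements, one for each value in Lags.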

[___,reg] = kpsstest(___) additionally returns reg, a structure of regression statistics for the hypothesis test.

Examples

Test a time series for a unit root using the default options of kpsstest. Input the time series data as a numeric vector.

Load the Nelson-Plosser macroeconomic series data set. Plot the real gross national product (RGNP).

load Data_NelsonPlosser
rgnp = DataTable.GNPR;
dt = datetime(dates,ConvertFrom="datenum");

plot(dt,rgnp)
title("Real Gross National Product")

[Figure: line plot titled "Real Gross National Product".]

The series exhibits exponential growth.

Linearize the RGNP series.

linRGNP = log(rgnp);

Assess the null hypothesis of the KPSS test, which is that the series is trend stationary. Use default options.

h = kpsstest(linRGNP)
h = logical
   1

h = 1 indicates that, at a 5% level of significance, the test rejects the null hypothesis that the linearized Real GNP series is trend stationary, which suggests that the series is unit root nonstationary.

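Return the p-value, test statistic, and critical value of the test in addition to the test decision.

Load the Nelson-Plosser macroeconomic series data set, and linearize the RGNP series.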

load Data_NelsonPlosser
linRGNP = log(DataTable.GNPR);

Assess the null hypothesis that the series is trend stationary. Return the test decision, p-value, test statistic, and critical value.

[h,pValue,stats,cValue] = kpsstest(linRGNP)
h = logical
   1

pValue = 
0.0100
stats = 
0.6299
cValue = 
0.1460

Test whether a time series, which is one variable in a table, is trend stationary using the default options.

Load the Nelson-Plosser macroeconomic series data set, which contains annual measurements of macroeconomic variables in the table DataTable. Linearize the RGNP series by applying the log transformation, and store the result in DataTable.

load Data_NelsonPlosser
DataTable.LinRGNP = log(DataTable.GNPR);
DataTable.Properties.VariableNames{end}
ans = 
'LinRGNP'

Test the null hypothesis that the linearized RGNP series is trend stationary.

StatTbl = kpsstest(DataTable)
StatTbl=1×7 table
                h      pValue     stat      cValue    Lags    Alpha    Trend
              _____    ______    _______    ______    ____    _____    _____

    Test 1    true      0.01     0.62989    0.146      0      0.05     true 

kpsstest returns test results and settings in the table StatTbl, where variables correspond to test results (h, pValue, stat, and cValue) and settings (Lags, Alpha, Trend), and rows correspond to individual tests (in this case, kpsstest conducts one test).

By default, kpsstest tests the last variable in the table. To select a variable from an input table to test, set the DataVariable option.

Conduct multiple tests on the linearized RGNP series that reproduce the first row of the second half of Table 5 in [2].

Load the Nelson-Plosser macroeconomic series data set, which contains annual measurements of macroeconomic variables in the table DataTable. Apply the log transformation to all variables in the table.

load Data_NelsonPlosser
LogDT = varfun(@log,DataTable);
LogDT.Properties.VariableNames{end}
ans = 
'log_SP'

varfun applies log to all variables in DataTable, prepends log_ to all transformed variable names, and stores the result in the table LogDT. The final variable is the log of the stock price index series (SP).

Assess the null hypothesis that the linearized RGNP series is trend stationary over a range of lags. Specify the variable name of the linearized RGNP series log_GNPR.

lags = (0:8);
StatTbl = kpsstest(LogDT,DataVariable="log_GNPR",Lags=lags)
StatTbl=9×7 table
                h       pValue      stat      cValue    Lags    Alpha    Trend
              _____    ________    _______    ______    ____    _____    _____

    Test 1    true         0.01    0.62989    0.146      0      0.05     true 
    Test 2    true         0.01    0.33666    0.146      1      0.05     true 
    Test 3    true         0.01    0.24209    0.146      2      0.05     true 
    Test 4    true       0.0169     0.1976    0.146      3      0.05     true 
    Test 5    true     0.027579    0.17291    0.146      4      0.05     true 
    Test 6    true      0.04015    0.15782    0.146      5      0.05     true 
    Test 7    true     0.048417     0.1479    0.146      6      0.05     true 
    Test 8    false     0.05886    0.14122    0.146      7      0.05     true 
    Test 9    false    0.066757    0.13695    0.146      8      0.05     true 

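The tests corresponding to 0 ≤ lags ≤ 2 report the minimum tabulated p-value of 0.01, which means that the actual p-values are 0.01 or smaller. For lags < 7, the tests indicate sufficient evidence to suggest that log RGNP is unit root nonstationary (as opposed to the series being trend stationary) at the default 5% level. For 7 and 8 lags, the tests fail to reject the null hypothesis at that level.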
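
One way to summarize this lag sensitivity programmatically is to index the Lags variable of StatTbl by the logical decisions in h. This is a sketch that repeats the call from this example; the names rejected and maxRejectLag are illustrative.

```matlab
load Data_NelsonPlosser
LogDT = varfun(@log,DataTable);
StatTbl = kpsstest(LogDT,DataVariable="log_GNPR",Lags=0:8);

rejected = StatTbl.Lags(StatTbl.h);  % lags whose tests reject the null
maxRejectLag = max(rejected)         % largest lag that still rejects
```

For the results displayed in the table, the largest lag at which the test still rejects the null hypothesis is 6.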

Test whether the wage series in the manufacturing sector (1900–1970) has a unit root. Use the advice in [2] to select the number of lags in the Newey-West estimator of the long-run variance.

Load the Nelson-Plosser macroeconomic data set. Remove all missing values from the data relative to the wage series WN.

load Data_NelsonPlosser
[DataTable,idx] = rmmissing(DataTable,DataVariables="WN");
dt = dates(~idx);

Compute the effective sample size T and its square root, where the latter is approximately the number of lags recommended for the Newey-West estimator.

T = height(DataTable);
sqrtT = sqrt(T);
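
The lag range that this example passes to kpsstest can be derived from sqrtT, for instance by bracketing it with one integer on either side. Both the bracketing rule and the hard-coded sample size are assumptions for illustration: the sketch assumes T = 71, which corresponds to annual observations from 1900 through 1970.

```matlab
T = 71;                         % assumed: annual data, 1900 through 1970
sqrtT = sqrt(T);                % approximately 8.43

% Bracket sqrt(T) by one integer lag below its floor and above its ceiling.
lagRange = (floor(sqrtT)-1):(ceil(sqrtT)+1)
```

Under these assumptions, lagRange is 7:10.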

Plot the wage series.

plot(dt,DataTable.WN)
title("Wages")

[Figure: line plot titled "Wages".]

The wage series appears to grow exponentially.

Linearize the wages series by applying the log transformation to all variables in the table.

LogDT = varfun(@log,DataTable);
plot(dt,LogDT.log_WN)
title("Log Wages")

[Figure: line plot titled "Log Wages".]

The log wage series appears to have a linear trend.

Test the null hypothesis that the log wage series is trend stationary (no unit root) against the alternative hypothesis that the log wage series is difference stationary. Conduct the test by setting a range of lags for the Newey-West estimator around √T.

StatTbl = kpsstest(LogDT,DataVariable="log_WN",Lags=7:10)
StatTbl=4×7 table
                h      pValue      stat      cValue    Lags    Alpha    Trend
              _____    ______    ________    ______    ____    _____    _____

    Test 1    false     0.1       0.10678    0.146       7     0.05     true 
    Test 2    false     0.1       0.10074    0.146       8     0.05     true 
    Test 3    false     0.1      0.096634    0.146       9     0.05     true 
    Test 4    false     0.1      0.094058    0.146      10     0.05     true 

All tests fail to reject the null hypothesis that the log wages series is trend stationary.

The reported p-values equal 0.1, the maximum tabulated value, which means that the actual p-values are 0.1 or larger. The software compares the test statistic to critical values and computes p-values that it interpolates from tables in [2].

Load the Nelson-Plosser macroeconomic series data set. Apply the log transformation to all variables in the table.

load Data_NelsonPlosser
LogDT = varfun(@log,DataTable);

Assess the null hypothesis that the linearized RGNP series is trend stationary. Use the Trend option to conduct the test with (true) and without (false) a deterministic time trend term in the response model. Return the regression statistics.

[~,reg] = kpsstest(LogDT,DataVariable="log_GNPR",Trend=[true false]);

reg is a structure array of length 2 with fields that store the OLS regression results. Each element corresponds to a test.

Compare the coefficient estimates.

withTrend = reg(1).coeff
withTrend = 2×1

    4.5834
    0.0310

woTrend = reg(2).coeff
woTrend = 
5.5595

For the first test, the response model for the regression includes a trend term, so the regression coefficients withTrend include a model intercept (under the null hypothesis) 4.5834 and the coefficient of the time trend 0.0310. For the second test, the response model includes an intercept only for the regression, so the intercept woTrend is 5.5595.

Display the coefficient standard errors for the first test.

reg(1).se
ans = 2×1

    0.0344
    0.0010

The Lags option includes autocovariance lags in the Newey-West estimator of the long-run variance. Therefore, the option does not affect the estimated OLS coefficients, standard errors, or MSE.

Conduct a KPSS test for each lag from 0 through 4. Compare the standard OLS and the Newey-West estimates.

lags = 0:4;
[~,regLags] = kpsstest(LogDT,DataVariable="log_GNPR",Lags=lags);

coeffs = table(regLags.coeff,VariableNames="Lags_"+lags, ...
    RowNames=["Intercept" "Trend"]);
se = table(regLags.se,VariableNames="Lags_"+lags, ...
    RowNames=["SE_Intercept" "SE_Trend"]);
mse = table(regLags.MSE,VariableNames="Lags_"+lags, ...
    RowNames="MSE");
nw = table(regLags.NWEst,VariableNames="Lags_"+lags, ...
    RowNames="NWVar");
[coeffs; se; mse; nw]
ans=6×5 table
                      Lags_0        Lags_1        Lags_2        Lags_3        Lags_4  
                    __________    __________    __________    __________    __________

    Intercept           4.5834        4.5834        4.5834        4.5834        4.5834
    Trend             0.030988      0.030988      0.030988      0.030988      0.030988
    SE_Intercept       0.03443       0.03443       0.03443       0.03443       0.03443
    SE_Trend        0.00095035    0.00095035    0.00095035    0.00095035    0.00095035
    MSE               0.017933      0.017933      0.017933      0.017933      0.017933
    NWVar             0.017354       0.03247      0.045154      0.055321      0.063222

Input Arguments

Univariate time series data, specified as a numeric vector. Each element of y represents an observation.

Data Types: double

Time series data, specified as a table or timetable. Each row of Tbl is an observation.

Specify a single series (variable) to test by using the DataVariable argument. The selected variable must be numeric.

Note

kpsstest removes missing observations, represented by NaN values, from the input series.
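
The effect of this removal can be checked through the num field of the regression statistics output. The following is a small sketch on simulated data; the series, the seed, and the positions of the missing values are assumptions chosen for illustration.

```matlab
rng(3)                          % fix the seed so the sketch is repeatable
y = randn(100,1);               % hypothetical stationary series
y([10 50]) = NaN;               % mark two observations as missing

[~,~,~,~,reg] = kpsstest(y);
numUsed = reg.num               % length of the series after NaN removal
```

With two of the 100 observations missing, numUsed is expected to be 98.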

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: kpsstest(Tbl,DataVariable="GDP",Alpha=0.025,Lags=[0 1]) conducts two tests, at a level of significance of 0.025, for the presence of a unit root in the variable GDP of the table Tbl. The first test includes 0 autocovariance lags in the Newey-West estimator of the long-run variance and the second test includes 1 autocovariance lag.

Number of autocovariance lags to include in the Newey-West estimator of the long-run variance, specified as a nonnegative integer or vector of nonnegative integers. If Lags(j) > 0, kpsstest includes lags 1 through Lags(j) in the estimator for test j.

kpsstest conducts a separate test for each element in Lags.

Example: Lags=0:2 includes zero lagged autocovariance terms in the Newey-West estimator for the first test, the lag 1 autocovariance term for the second test, and autocovariance lags 1 and 2 in the third test.

Data Types: double

Flag for including deterministic trend δt in the model, specified as a logical scalar or vector.

kpsstest conducts a separate test for each element in Trend.

Example: Trend=false excludes δt from the response model for all tests.

Data Types: logical

Significance level for the hypothesis test, specified as a numeric scalar or vector with entries between 0.01 and 0.10.

kpsstest conducts a separate test for each element in Alpha.

Example: Alpha=[0.01 0.05] uses a level of significance of 0.01 for the first test, and then uses a level of significance of 0.05 for the second test.

Data Types: double

Variable in Tbl to test, specified as a string scalar or character vector containing a variable name in Tbl.Properties.VariableNames, or an integer or logical vector representing the index of a name. The selected variable must be numeric.

Example: DataVariable="GDP"

Example: DataVariable=[false true false false] or DataVariable=2 tests the second table variable.

Data Types: double | logical | char | string
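
The three ways of selecting a variable can be compared on a small table. This is a sketch with simulated data; the table Tbl and its variable names Noise and Walk are hypothetical.

```matlab
rng(2)                               % fix the seed so the sketch is repeatable
Noise = randn(50,1);                 % hypothetical stationary variable
Walk = cumsum(randn(50,1));          % hypothetical random-walk variable
Tbl = table(Noise,Walk);

byName = kpsstest(Tbl,DataVariable="Noise");        % select by name
byIndex = kpsstest(Tbl,DataVariable=1);             % select by integer index
byMask = kpsstest(Tbl,DataVariable=[true false]);   % select by logical vector
sameResult = isequal(byName,byIndex,byMask)
```

All three calls select the first variable, so the returned tables should be identical.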

Note

  • When kpsstest conducts multiple tests, the function applies all single settings (scalars or character vectors) to each test.

  • All vector-valued specifications that control the number of tests must have equal length.

  • If you specify the vector y and any value is a row vector, all outputs are row vectors.
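
Because vector-valued settings must have equal length, they appear to be paired element by element rather than crossed. Under that assumption, testing every combination of settings requires expanding them first. The sketch below does so with ndgrid on simulated data; the series and the variable names are illustrative.

```matlab
rng(4)                               % fix the seed so the sketch is repeatable
y = cumsum(randn(150,1));            % hypothetical data: simulated random walk

% Build every (lag, trend) pair; both vectors end up with 6 elements.
[lagGrid,trendGrid] = ndgrid(0:2,[1 0]);
lags = lagGrid(:);
trend = logical(trendGrid(:));

[h,pValue] = kpsstest(y,Lags=lags,Trend=trend);
numTests = numel(h)
```

Three lag values crossed with two trend settings give six tests.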

Output Arguments

Test rejection decisions, returned as a logical scalar or vector with length equal to the number of tests. kpsstest returns h when you supply the input y.

  • Values of 1 indicate rejection of the trend-stationary null hypothesis in favor of the unit root alternative.

  • Values of 0 indicate failure to reject the trend-stationary null hypothesis.

Test statistic p-values, returned as a numeric scalar or vector with length equal to the number of tests. kpsstest returns pValue when you supply the input y.

The p-values are right-tail probabilities.

When test statistics are outside tabulated critical values, kpsstest returns maximum (0.10) or minimum (0.01) p-values.

Test statistics, returned as a numeric scalar or vector with length equal to the number of tests. kpsstest returns stat when you supply the input y.

kpsstest computes test statistics by using an ordinary least squares (OLS) regression (for more details, see KPSS Test).

  • If you set Trend=false, kpsstest regresses y on an intercept.

  • Otherwise, kpsstest regresses y on an intercept and trend term.

Critical values, returned as a numeric scalar or vector with length equal to the number of tests. kpsstest returns cValue when you supply the input y.

Critical values are for right-tail probabilities.
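
Because the test is right-tailed, a rejection corresponds to a test statistic that exceeds its critical value. The following sketch checks that relationship on simulated data; the series and the seed are assumptions for illustration.

```matlab
rng(5)                               % fix the seed so the sketch is repeatable
y = cumsum(randn(120,1));            % hypothetical data: simulated random walk
[h,pValue,stat,cValue] = kpsstest(y,Lags=0:3);

% Right-tail test: reject when the statistic exceeds the critical value.
consistent = isequal(h,stat > cValue)
```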

Test summary, returned as a table with variables for the outputs h, pValue, stat, and cValue, and with a row for each test. kpsstest returns StatTbl when you supply the input Tbl.

StatTbl contains variables for the test settings specified by Lags, Alpha, and Trend.

Regression statistics for OLS estimation of the coefficients in the model, returned as a structure array with the number of records equal to the number of tests.

Each element of reg has the fields in this table. You can access a field using dot notation, for example, reg(1).coeff contains the coefficient estimates of the first test.

Field     Description
num       Length of input series with NaNs removed
size      Effective sample size T, adjusted for lags
names     Regression coefficient names
coeff     Estimated coefficient values
se        Estimated coefficient standard errors
Cov       Estimated coefficient covariance matrix
tStats    t statistics of coefficients and p-values
FStat     F statistic and p-value
yMu       Mean of the lag-adjusted input series
ySigma    Standard deviation of the lag-adjusted input series
yHat      Fitted values of the lag-adjusted input series
res       Regression residuals
autoCov   Estimated residual autocovariances
NWEst     Newey-West estimate of the long-run variance
DWStat    Durbin-Watson statistic
SSR       Regression sum of squares
SSE       Error sum of squares
SST       Total sum of squares
MSE       Mean square error
RMSE      Standard error of the regression
RSq       R² statistic
aRSq      Adjusted R² statistic
LL        Loglikelihood of data under Gaussian innovations
AIC       Akaike information criterion
BIC       Bayesian (Schwarz) information criterion
HQC       Hannan-Quinn information criterion

More About

Kwiatkowski, Phillips, Schmidt, and Shin (KPSS) Test

The KPSS test assesses the null hypothesis that a univariate time series is trend stationary against the alternative that it is a nonstationary unit root process.

The test uses the structural model

$$
\begin{aligned}
y_t &= c_t + \delta t + u_{1t} \\
c_t &= c_{t-1} + u_{2t},
\end{aligned}
$$

where

  • $\delta$ is the trend coefficient (see the Trend argument).

  • $u_{1t}$ is a stationary process.

  • $u_{2t}$ is an independent and identically distributed process with mean 0 and variance $\sigma^2$.

The null hypothesis is that $\sigma^2 = 0$, which implies that the random walk term ($c_t$) is constant and acts as the model intercept. The alternative hypothesis is that $\sigma^2 > 0$, which introduces the unit root in the random walk.
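
To make the two hypotheses explicit, the random walk can be solved recursively from an initial value $c_0$ (a symbol introduced here for illustration only). The following short derivation is a sketch of why a zero variance removes the unit root:

```latex
% Solve the random walk recursively and substitute into the model.
c_t = c_{t-1} + u_{2t} = c_0 + \sum_{i=1}^{t} u_{2i}
\quad\Longrightarrow\quad
y_t = c_0 + \delta t + \sum_{i=1}^{t} u_{2i} + u_{1t}.

% Under the null hypothesis, \sigma^2 = 0 forces every u_{2i} to equal 0,
% so the accumulated sum vanishes and c_0 acts as the intercept.
y_t = c_0 + \delta t + u_{1t}.
```

Under the alternative, the accumulated sum of the $u_{2i}$ terms is the unit root component of the series.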

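An OLS regression of $y_t$ onto $X_t$ yields the residual series $\{e_t\}$, where $X_t$ has one of the following forms:

  • $X_t = 1$ for all t when Trend is false.

  • $X_t = [1 \;\; t]$ when Trend is true. The regression estimates the trend coefficient $\delta$ as the coefficient of t.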

The test statistic is

$$
\frac{\sum_{t=1}^{T} S_t^2}{s^2 T^2},
$$

where

  • $T$ is the effective sample size.

  • $s^2$ is the Newey-West estimate of the long-run variance.

  • $S_t = e_1 + e_2 + \dots + e_t$ is the partial sum of the residuals through time t.
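
The steps above can be sketched in a few lines of base MATLAB. This is an illustration under stated assumptions, not the function's internal code: the data are simulated, the number of lags q is chosen arbitrarily, and the long-run variance uses Bartlett weights as in [2] and [3].

```matlab
rng(0)                               % fix the seed so the sketch is repeatable
T = 100;
t = (1:T)';
y = 1 + 0.05*t + randn(T,1);         % hypothetical trend-stationary series

X = [ones(T,1) t];                   % regressors for Trend = true
e = y - X*(X\y);                     % OLS residuals e_t
S = cumsum(e);                       % partial sums S_t

q = 2;                               % assumed number of autocovariance lags
s2 = sum(e.^2)/T;                    % lag-0 term of the long-run variance
for j = 1:q
    w = 1 - j/(q+1);                 % Bartlett weight for lag j
    s2 = s2 + 2*w*sum(e(j+1:T).*e(1:T-j))/T;
end

stat = sum(S.^2)/(s2*T^2)            % KPSS test statistic
```

If Econometrics Toolbox is available, the result can be compared with the stat output of kpsstest(y,Lags=q).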

Tips

  • To draw valid inferences from a KPSS test, you must determine a suitable value for the Lags argument. The following methods can determine a suitable number of lags:

    • Begin with a small number of lags, and then evaluate the sensitivity of the results by adding more lags.

    • Kwiatkowski et al. [2] suggest that a number of lags on the order of √T, where T is the effective sample size, is often satisfactory under both the null and the alternative.

    For consistency of the Newey-West estimator, the number of lags must approach infinity as the sample size increases.

  • With a specific testing strategy in mind, determine the value of the Trend argument by the growth characteristics of the input time series.

    • If the input series grows, include a trend term by setting Trend to true (default). This setting provides a reasonable comparison of a trend stationary null and a unit root process with drift.

    • If a series does not exhibit long-term growth characteristics, exclude a trend term by setting Trend to false.

Algorithms

  • Test statistics follow nonstandard distributions under the null, even asymptotically. Kwiatkowski et al. [2] use Monte Carlo simulations, for models with and without a trend, to tabulate asymptotic critical values for a standard set of significance levels between 0.01 and 0.1. kpsstest interpolates critical values and p-values from these tables.
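
As a rough sketch of the table lookup, the following code interpolates a p-value linearly from the asymptotic critical values for the model with a trend, as tabulated in [2]. Linear interpolation and the clipping rule are assumptions made for illustration; the function's internal scheme may differ.

```matlab
% Asymptotic critical values for the model with a trend, from Table 1 in [2].
alphaTab = [0.10 0.05 0.025 0.01];
cvTab = [0.119 0.146 0.176 0.216];

stat = 0.1976;                       % statistic reported for 3 lags in the examples

if stat <= cvTab(1)
    pValue = alphaTab(1);            % statistic below the table: report maximum
elseif stat >= cvTab(end)
    pValue = alphaTab(end);          % statistic above the table: report minimum
else
    pValue = interp1(cvTab,alphaTab,stat);   % linear interpolation
end
pValue
```

For stat = 0.1976, the sketch gives approximately 0.0169, which agrees with the p-value reported for 3 lags in the examples.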

References

[1] Hamilton, James D. Time Series Analysis. Princeton, NJ: Princeton University Press, 1994.

[2] Kwiatkowski, D., P. C. B. Phillips, P. Schmidt, and Y. Shin. "Testing the Null Hypothesis of Stationarity against the Alternative of a Unit Root." Journal of Econometrics. Vol. 54, 1992, pp. 159–178.

[3] Newey, W. K., and K. D. West. "A Simple, Positive Semidefinite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix." Econometrica. Vol. 55, 1987, pp. 703–708.

Version History

Introduced in R2009b