forecast
Forecast responses of univariate regression model with ARIMA time series errors
Syntax
Description
[Y,YMSE] = forecast(Mdl,numperiods) returns the numperiods-by-1 numeric vector of consecutive forecasted responses Y and the corresponding numeric vector of forecast mean square errors (MSE) YMSE of the fully specified, univariate regression model with ARIMA time series errors Mdl.
[Y,YMSE,U] = forecast(Mdl,numperiods) also forecasts a numperiods-by-1 numeric vector of unconditional disturbances U.
[___] = forecast(___,Name=Value) specifies options using one or more name-value arguments in addition to any of the input argument combinations in previous syntaxes. forecast returns the output argument combination for the corresponding input arguments. For example, forecast(Mdl,10,Y0=y0,X0=Pred0,XF=Pred) specifies the presample response path y0, and the presample and forecast sample predictor data Pred0 and Pred, respectively, to forecast a model with a regression component.
Tbl = forecast(Mdl,numperiods,Presample=Presample,PresampleRegressionDisturbanceVariable=PresampleRegressionDisturbanceVariable) returns the table or timetable Tbl containing a variable for each of the paths of response, forecast MSE, and unconditional disturbance series resulting from forecasting the regression model with ARIMA errors Mdl over a numperiods forecast horizon. Presample is a table or timetable containing presample unconditional disturbance data in the variable specified by PresampleRegressionDisturbanceVariable. Alternatively, Presample can contain presample error model innovation data in the variable specified by PresampleInnovationVariable or a combination of presample response and predictor data in the variables specified by PresampleResponseVariable and PresamplePredictorVariables. You can specify either alternative instead of PresampleRegressionDisturbanceVariable using name-value syntax; forecast infers presample unconditional disturbance data from either alternative specification. (since R2023b)
Tbl = forecast(Mdl,numperiods,InSample=InSample,PredictorVariables=PredictorVariables) specifies the variables PredictorVariables in the in-sample table or timetable of data InSample containing the predictor data for the model regression component. (since R2023b)
Tbl = forecast(Mdl,numperiods,Presample=Presample,PresampleRegressionDisturbanceVariable=PresampleRegressionDisturbanceVariable,InSample=InSample,PredictorVariables=PredictorVariables) specifies presample unconditional disturbance data to initialize the error model and in-sample predictor data for the regression component. You can choose different presample data from Presample when it is applicable. (since R2023b)
Tbl = forecast(___,Name=Value) uses additional options specified by one or more name-value arguments, using any input argument combination in the previous three syntaxes. (since R2023b)
For example, forecast(Mdl,20,Presample=PSTbl,PresampleResponseVariable="GDP",PresamplePredictorVariables="CPI",InSample=Tbl,PredictorVariables="CPI") returns a timetable containing variables for the forecasted responses, forecast MSE, and forecasted unconditional disturbance paths, forecasted 20 periods into the future. forecast initializes the model by using the presample response and predictor data in the GDP and CPI variables of the timetable PSTbl. forecast applies the predictor data in the CPI variable of the in-sample table or timetable Tbl to the model regression component.
Examples
Forecast Vector of Responses from Regression Model with ARIMA Errors
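Return a vector of responses, forecasted over a 30-period horizon, from the following regression model with ARMA(2,1) errors:
$$y_t = X_t \begin{bmatrix} 0.1 \\ -0.2 \end{bmatrix} + u_t$$
$$u_t = 0.5u_{t-1} - 0.8u_{t-2} + \varepsilon_t - 0.5\varepsilon_{t-1},$$
where $\varepsilon_t$ is Gaussian with variance 0.1.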
Specify the model. Simulate responses from the model and two predictor series.
Mdl0 = regARIMA(Intercept=0,AR={0.5 -0.8},MA=-0.5, ...
    Beta=[0.1; -0.2],Variance=0.1);
rng(1,"twister"); % For reproducibility
T = 130;
numperiods = 30;
Pred = randn(T,2);
y = simulate(Mdl0,T,X=Pred);
Fit the model to the first 100 observations, and reserve the remaining 30 observations to evaluate forecast performance.
Mdl = regARIMA(2,0,1);
estidx = 1:(T-numperiods);  % Estimation sample indices
fhidx = (T-numperiods+1):T; % Forecast horizon
EstMdl = estimate(Mdl,y(estidx),X=Pred(estidx,:));
Regression with ARMA(2,1) Error Model (Gaussian Distribution):

                   Value      StandardError    TStatistic      PValue
                 _________    _____________    __________    __________
    Intercept    0.0074068      0.012554         0.58999         0.5552
    AR{1}          0.55422      0.087265           6.351     2.1391e-10
    AR{2}         -0.78361      0.080794         -9.6988     3.0499e-22
    MA{1}         -0.46483        0.1394         -3.3345     0.00085446
    Beta(1)       0.092779      0.024497          3.7873     0.00015228
    Beta(2)       -0.17339      0.021143         -8.2008     2.3874e-16
    Variance      0.073721      0.011006          6.6984     2.1066e-11
EstMdl is a new regARIMA model containing the estimates. The estimates are close to their true values.
Use EstMdl to forecast a 30-period horizon.
[yF,yMSE] = forecast(EstMdl,numperiods,Y0=y(estidx), ...
X0=Pred(estidx,:),XF=Pred(fhidx,:));
yF is a 30-by-1 vector of forecasted responses and yMSE is a 30-by-1 vector of corresponding forecast MSEs. To initialize the model for forecasting, forecast infers required presample unconditional disturbances from the specified presample response and predictor data.
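The inference step described above can also be done explicitly. The following is a minimal sketch, not part of the original example: it rebuilds the simulated data, then compares forecasts initialized from Y0 and X0 with forecasts initialized from unconditional disturbances obtained beforehand with infer. The variable names u0, yF1, and yF2 are illustrative, and the sketch assumes that the second output of infer for a regARIMA model is the unconditional disturbance series. If the two routes are equivalent as the text describes, the forecasts should agree up to numerical error.

```matlab
% Sketch only: two ways to initialize the error model for forecasting.
rng(1,"twister"); % For reproducibility
Mdl0 = regARIMA(Intercept=0,AR={0.5 -0.8},MA=-0.5, ...
    Beta=[0.1; -0.2],Variance=0.1);
T = 130;
numperiods = 30;
Pred = randn(T,2);
y = simulate(Mdl0,T,X=Pred);
estidx = 1:(T-numperiods);  % Estimation sample indices
fhidx = (T-numperiods+1):T; % Forecast horizon
EstMdl = estimate(regARIMA(2,0,1),y(estidx),X=Pred(estidx,:), ...
    Display="off");

% Route 1: forecast infers presample disturbances from Y0 and X0.
yF1 = forecast(EstMdl,numperiods,Y0=y(estidx), ...
    X0=Pred(estidx,:),XF=Pred(fhidx,:));

% Route 2: infer the disturbances first (assumed second output of infer),
% then supply them directly as U0. Y0 and X0 are not needed in this case.
[~,u0] = infer(EstMdl,y(estidx),X=Pred(estidx,:));
yF2 = forecast(EstMdl,numperiods,U0=u0,XF=Pred(fhidx,:));

% Largest absolute difference between the two sets of forecasts.
maxDiff = max(abs(yF1 - yF2))
```

Route 2 is useful when the disturbances are already available from an earlier call to infer, because it avoids passing the presample response and predictor data again.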
Visually compare the forecasts to the holdout data using a plot.
figure
plot(y,Color=[.7,.7,.7]);
hold on
plot(fhidx,yF,"b",LineWidth=2);
plot(fhidx,yF + 1.96*sqrt(yMSE),"r:",LineWidth=2);
plot(fhidx,yF - 1.96*sqrt(yMSE),"r:",LineWidth=2);
h = gca;
ph = patch([repmat(T-numperiods+1,1,2) repmat(T,1,2)], ...
    [h.YLim fliplr(h.YLim)],[0 0 0 0],"b");
ph.FaceAlpha = 0.1;
legend("Observed","Forecast","95% forecast interval", ...
    Location="best");
title("30-Period Forecasts and 95% Forecast Intervals")
axis tight
hold off
Many observations in the holdout sample fall beyond the 95% forecast intervals. Two reasons for this are:
The predictors are randomly generated in this example. estimate treats the predictors as fixed. The 95% forecast intervals based on the estimates from estimate do not account for the variability in the predictors.
By sheer chance, the estimation period seems less volatile than the forecast period. estimate uses the less volatile estimation period data to estimate the parameters. Therefore, forecast intervals based on the estimates should not be expected to cover observations that have an underlying innovations process with larger variability.
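To put a number on how many holdout observations fall inside the intervals, you can compute the empirical coverage. The following is a minimal sketch that rebuilds the simulated example and counts the holdout observations inside the approximate 95% bounds. The names lower, upper, inside, and coverage are illustrative and do not appear in the original example.

```matlab
% Sketch only: empirical coverage of the approximate 95% forecast intervals.
rng(1,"twister"); % For reproducibility
Mdl0 = regARIMA(Intercept=0,AR={0.5 -0.8},MA=-0.5, ...
    Beta=[0.1; -0.2],Variance=0.1);
T = 130;
numperiods = 30;
Pred = randn(T,2);
y = simulate(Mdl0,T,X=Pred);
estidx = 1:(T-numperiods);  % Estimation sample indices
fhidx = (T-numperiods+1):T; % Forecast horizon
EstMdl = estimate(regARIMA(2,0,1),y(estidx),X=Pred(estidx,:), ...
    Display="off");
[yF,yMSE] = forecast(EstMdl,numperiods,Y0=y(estidx), ...
    X0=Pred(estidx,:),XF=Pred(fhidx,:));

% Interval bounds based on the Gaussian 97.5% quantile (about 1.96).
lower = yF - 1.96*sqrt(yMSE);
upper = yF + 1.96*sqrt(yMSE);

% Fraction of holdout observations that fall inside the bounds.
inside = (y(fhidx) >= lower) & (y(fhidx) <= upper);
coverage = mean(inside)
```

A coverage value well below 0.95 is consistent with the two reasons listed above, although a single 30-observation holdout sample is too small to draw firm conclusions.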
Forecast GDP Using Regression Model with ARMA Errors
Forecast the stationary GDP growth rate using a regression model with ARMA(1,1) errors, including CPI quarterly changes as a predictor.
Fit a regression model with ARMA(1,1) errors by regressing the US gross domestic product (GDP) growth rate onto consumer price index (CPI) quarterly changes. Forecast the model into a 2-year (8-quarter) horizon. Supply a timetable of data and specify the series for the fit.
Load and Transform Data
Load the US macroeconomic data set. Compute the series of GDP quarterly growth rates and CPI quarterly changes.
load Data_USEconModel
DTT = price2ret(DataTimeTable,DataVariables="GDP");
DTT.GDPRate = 100*DTT.GDP;
DTT.CPIDel = diff(DataTimeTable.CPIAUCSL);
T = height(DTT)
T = 248
figure
tiledlayout(2,1)
nexttile
plot(DTT.Time,DTT.GDPRate)
title("GDP Rate")
ylabel("Percent Growth")
nexttile
plot(DTT.Time,DTT.CPIDel)
title("Index")
The series appear stationary, albeit heteroscedastic.
Prepare Timetable for Estimation
When you plan to supply a timetable, you must ensure it has all the following characteristics:
The selected response variable is numeric and does not contain any missing values.
The timestamps in the
Time
variable are regular, and they are ascending or descending.
Remove all missing values from the timetable.
DTT = rmmissing(DTT);
T_DTT = height(DTT)
T_DTT = 248
Because each sample time has an observation for all variables, rmmissing
does not remove any observations.
Determine whether the sampling timestamps have a regular frequency and are sorted.
areTimestampsRegular = isregular(DTT,"quarters")
areTimestampsRegular = logical
0
areTimestampsSorted = issorted(DTT.Time)
areTimestampsSorted = logical
1
areTimestampsRegular = 0 indicates that the timestamps of DTT are irregular. areTimestampsSorted = 1 indicates that the timestamps are sorted. Macroeconomic series in this example are timestamped at the end of the month, which makes the timestamps irregular with respect to calendar quarters.
Remedy the time irregularity by shifting all dates to the first day of the quarter.
dt = DTT.Time;
dt = dateshift(dt,"start","quarter");
DTT.Time = dt;
areTimestampsRegular = isregular(DTT,"quarters")
areTimestampsRegular = logical
1
DTT
is regular.
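The checks in this section can be collected into a short routine. The following is a minimal sketch on a hypothetical toy timetable stamped at quarter ends. The timetable TT, its variable x, and the result names are illustrative and are not the data set used above. The sketch uses only the isregular, issorted, ismissing, and dateshift calls that this section already describes.

```matlab
% Sketch only: validate a timetable before passing it to estimate or forecast.
% Hypothetical toy series stamped on the last day of each quarter.
t = datetime(2020,[3;6;9;12],[31;30;30;31]);
TT = timetable(t,(1:4)',VariableNames="x");

% Check 1: no missing values in any variable.
hasMissing = any(ismissing(TT),"all");

% Check 2: timestamps are sorted.
isSorted = issorted(TT.Time);

% Check 3: timestamps are regular with respect to calendar quarters.
regularBefore = isregular(TT,"quarters");

% Remedy: shift every timestamp to the first day of its quarter.
TT.Time = dateshift(TT.Time,"start","quarter");
regularAfter = isregular(TT,"quarters");
```

After the shift, the timestamps fall on the first day of consecutive quarters, so the regularity check is expected to pass regardless of the result before the shift.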
Create Model Template for Estimation
Suppose that a regression model of the GDP rate onto CPI quarterly changes, with ARMA(1,1) errors, is appropriate.
Create a model template for a regression model with ARMA(1,1) errors. Specify the response variable name.
Mdl = regARIMA(1,0,1);
Mdl.SeriesName = "GDPRate";
Mdl
is a partially specified regARIMA
object.
Partition Data
Partition the data set into estimation and forecast samples.
fh = 8;
DTTES = DTT(1:(T_DTT-fh),:);
DTTFS = DTT((T_DTT-fh+1):end,:);
Fit Model to Data
Fit a regression model with ARMA(1,1) errors to the estimation sample. Supply the estimation sample timetable, which contains the GDP rate and CPI quarterly changes series, and specify the predictor variable name.
EstMdl = estimate(Mdl,DTTES,PredictorVariables="CPIDel");
Regression with ARMA(1,1) Error Model (Gaussian Distribution):

                   Value       StandardError    TStatistic      PValue
                 __________    _____________    __________    __________
    Intercept      0.016489      0.0017307        9.5272      1.6152e-21
    AR{1}           0.57835       0.096952        5.9653      2.4415e-09
    MA{1}          -0.15125        0.11658       -1.2974         0.19449
    Beta(1)       0.0025095      0.0014147        1.7738        0.076089
    Variance     0.00011319     7.5405e-06         15.01      6.2792e-51
EstMdl
is a fully specified, estimated regARIMA
object. By default, estimate
backcasts for the required Mdl.P = 1
presample regression model residual and sets the required Mdl.Q = 1
presample error model residual to 0.
Forecast Estimated Model
Forecast the GDP rate over an 8-quarter horizon. Use the estimation sample as a presample for the forecast.
Tbl = forecast(EstMdl,fh,Presample=DTTES,PresampleResponseVariable="GDPRate", ...
    PresamplePredictorVariables="CPIDel",InSample=DTTFS, ...
    PredictorVariables="CPIDel")
Tbl=8×7 timetable
Time Interval GDP GDPRate CPIDel GDPRate_Response GDPRate_MSE GDPRate_RegressionInnovation
_____ ________ ___________ __________ ______ ________________ ___________ ____________________________
Q2-07 91 0.00018278 0.018278 1.675 0.015765 0.00011319 -0.0049278
Q3-07 91 0.00016916 0.016916 1.359 0.01705 0.00013383 -0.00285
Q4-07 94 6.1286e-05 0.0061286 3.355 0.02326 0.00014074 -0.0016483
Q1-08 91 9.3272e-05 0.0093272 1.93 0.020379 0.00014305 -0.00095329
Q2-08 91 0.00011103 0.011103 3.367 0.024387 0.00014382 -0.00055134
Q3-08 92 8.9585e-05 0.0089585 1.641 0.020288 0.00014408 -0.00031887
Q4-08 92 -0.00016145 -0.016145 -7.098 -0.0015075 0.00014417 -0.00018442
Q1-09 90 -8.6878e-05 -0.0086878 1.137 0.019236 0.0001442 -0.00010666
Tbl is an 8-by-7 timetable containing the forecasted responses GDPRate_Response and their forecast MSEs GDPRate_MSE, the forecasted unconditional disturbances GDPRate_RegressionInnovation, and all variables in DTTFS.
Plot the forecasts and 95% forecast intervals.
Tbl.Lower = Tbl.GDPRate_Response - 1.96*sqrt(Tbl.GDPRate_MSE);
Tbl.Upper = Tbl.GDPRate_Response + 1.96*sqrt(Tbl.GDPRate_MSE);
figure
h1 = plot(DTT.Time(end-65:end),DTT.GDPRate(end-65:end), ...
    Color=[.7,.7,.7]);
hold on
h2 = plot(Tbl.Time,Tbl.GDPRate_Response,"b",LineWidth=2);
h3 = plot(Tbl.Time,Tbl.Lower,"r:",LineWidth=2);
plot(DTTFS.Time,Tbl.Upper,"r:",LineWidth=2);
ha = gca;
title("GDP Rate Forecasts and 95% Forecast Intervals")
ph = patch([repmat(Tbl.Time(1),1,2) repmat(Tbl.Time(end),1,2)],...
    [ha.YLim fliplr(ha.YLim)],...
    [0 0 0 0],"b");
ph.FaceAlpha = 0.1;
legend([h1 h2 h3],["Observed GDP rate" "Forecasted GDP rate", ...
    "95% forecast interval"],Location="best")
axis tight
hold off
Forecast Regression Model with ARIMA Errors With Known Intercept
Fit a regression model with ARIMA(1,1,1) errors by regressing the quarterly log US GDP onto the CPI. Compute MMSE forecasts of the log GDP series using the estimated model. Supply data in timetables.
Load the US macroeconomic data set. Compute the log GDP series.
load Data_USEconModel
DTT = DataTimeTable;
DTT.LogGDP = log(DTT.GDP);
T = height(DTT);
Remedy the time irregularity by shifting all dates to the first day of the quarter.
dt = DTT.Time;
dt = dateshift(dt,"start","quarter");
DTT.Time = dt;
Reserve 2 years (8 quarters) of data at the end of the series to compare against the forecasts.
numperiods = 8;
DTTES = DTT(1:(T-numperiods),:);     % Estimation sample
DTTFS = DTT((T-numperiods+1):T,:);   % Forecast horizon
Suppose that a regression model of the quarterly log GDP on CPI, with ARIMA(1,1,1) errors, is appropriate.
Create a model template for a regression model with ARIMA(1,1,1) errors. Specify the response variable name.
Mdl = regARIMA(1,1,1);
Mdl.SeriesName = "LogGDP";
The intercept is not identifiable in a regression model with integrated errors. Fix its value before estimation. One way to do this is to estimate the intercept using simple linear regression. Use the estimation sample.
coeff = [ones(T-numperiods,1) DTTES.CPIAUCSL]\DTTES.LogGDP;
Mdl.Intercept = coeff(1);
Consider performing a sensitivity analysis by using a grid of intercepts.
Reserve 2 years (8 quarters) of data at the end of the series to compare against the forecasts.
numperiods = 8;
estidx = 1:(T-numperiods);       % Estimation sample
frstHzn = (T-numperiods+1):T;    % Forecast horizon
Fit a regression model with ARIMA(1,1,1) errors to the estimation sample. Specify the predictor variable name.
EstMdl = estimate(Mdl,DTTES,PredictorVariables="CPIAUCSL");
Regression with ARIMA(1,1,1) Error Model (Gaussian Distribution):

                   Value       StandardError    TStatistic      PValue
                 __________    _____________    __________    ___________
    Intercept        5.8303              0           Inf                0
    AR{1}           0.92869       0.028414        32.684      2.6126e-234
    MA{1}          -0.39063       0.057599       -6.7819       1.1858e-11
    Beta(1)       0.0029335      0.0014645        2.0031         0.045166
    Variance     0.00010668     6.9256e-06        15.403        1.554e-53
EstMdl
is a fully specified, estimated regARIMA
object. By default, estimate
backcasts for the required Mdl.P = 2
presample regression model residual and sets the required Mdl.Q = 1
presample error model residual to 0.
Infer estimation sample unconditional disturbances to initialize the model for forecasting. Specify the predictor variable name.
Tbl0 = infer(EstMdl,DTTES,PredictorVariables="CPIAUCSL");
Forecast the estimated model over an 8-quarter horizon. Use the inferred unconditional disturbances as presample data. Specify the forecast sample predictor data and its variable name, and specify the presample unconditional disturbance variable name.
Tbl = forecast(EstMdl,numperiods,Presample=Tbl0, ...
    PresampleRegressionDisturbanceVariable="LogGDP_RegressionResidual", ...
    InSample=DTTFS,PredictorVariables="CPIAUCSL");
Plot the forecasted log GDP with approximate 95% forecast intervals. Also, separately plot the unconditional disturbances.
Tbl.Lower = Tbl.LogGDP_Response - 1.96*sqrt(Tbl.LogGDP_MSE);
Tbl.Upper = Tbl.LogGDP_Response + 1.96*sqrt(Tbl.LogGDP_MSE);
figure
tiledlayout(2,1)
nexttile
plot(DTT.Time(end-40:end),DTT.LogGDP(end-40:end),Color=[.7,.7,.7])
hold on
h1 = plot(Tbl.Time,[Tbl.Lower Tbl.Upper],"r:",LineWidth=2);
h2 = plot(Tbl.Time,Tbl.LogGDP_Response,"k",LineWidth=2);
h = gca;
ph = patch([repmat(Tbl.Time(1),1,2) repmat(Tbl.Time(end),1,2)], ...
    [h.YLim fliplr(h.YLim)],[0 0 0 0],"b");
ph.FaceAlpha = 0.1;
legend([h1(1) h2],["95% percentile intervals" "MMSE forecast"], ...
    Location="northwest")
axis tight
grid on
title("Log GDP Forecast Over 2-year Horizon")
hold off
nexttile
plot(DTT.Time,[Tbl0.LogGDP_RegressionResidual; Tbl.LogGDP_RegressionInnovation])
hold on
h = gca;
ph = patch([repmat(Tbl.Time(1),1,2) repmat(Tbl.Time(end),1,2)], ...
    [h.YLim fliplr(h.YLim)],[0 0 0 0],"b");
ph.FaceAlpha = 0.1;
axis tight
grid on
title("Unconditional Disturbances")
hold off
The unconditional disturbances ut are nonstationary; therefore, the widths of the forecast intervals grow with time.
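The link between integrated errors and widening intervals can be seen without any data. The following is a minimal sketch that compares the forecast MSE of a stationary ARMA(1,1) error model with that of an ARIMA(1,1,1) error model sharing the same coefficients. The coefficient values and the names MdlStat, MdlInt, mseStat, and mseInt are illustrative assumptions and are unrelated to the estimates above. Neither model has a regression component, so no predictor data is needed, and forecast uses its default zero presample.

```matlab
% Sketch only: forecast MSE for stationary versus integrated error models.
% Coefficients are arbitrary illustrative values.
MdlStat = regARIMA(Intercept=0,AR={0.5},MA={-0.3},Variance=1);
MdlInt  = regARIMA(Intercept=0,AR={0.5},D=1,MA={-0.3},Variance=1);

numperiods = 20;
[~,mseStat] = forecast(MdlStat,numperiods); % Levels off
[~,mseInt]  = forecast(MdlInt,numperiods);  % Keeps growing

% Interval half-widths implied by each MSE path.
halfWidthStat = 1.96*sqrt(mseStat);
halfWidthInt  = 1.96*sqrt(mseInt);

figure
plot(1:numperiods,[halfWidthStat halfWidthInt],LineWidth=2)
legend("ARMA(1,1) errors","ARIMA(1,1,1) errors",Location="northwest")
xlabel("Forecast horizon")
ylabel("95% interval half-width")
```

For the stationary model, the MSE converges to the unconditional variance of the disturbances. For the integrated model, each additional period adds a positive term, so the half-width increases at every horizon.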
Input Arguments
numperiods
— Forecast horizon
positive integer
Forecast horizon, or the number of time points in the forecast period, specified as a positive integer.
Data Types: double
Presample
— Presample data
table | timetable
Since R2023b
Presample data containing presample responses yt, predictors xt, unconditional disturbances ut, or error model innovations εt, to initialize the model, specified as a table or timetable with numprevars variables and numpreobs rows. You can select a response, error model innovation, unconditional disturbance, or multiple predictor variables from Presample by using the PresampleResponseVariable, PresampleInnovationVariable, PresampleRegressionDisturbanceVariable, or PresamplePredictorVariables name-value argument, respectively.
numpreobs
is the number of presample observations.
numpaths
is the maximum number of independent presample paths among
the specified variables, from which forecast
initializes the
resulting numpaths
forecasts (see Algorithms).
For all selected variables except predictor variables, each variable contains a
single path (numpreobs
-by-1 vector) or multiple paths
(numpreobs
-by-numpaths
matrix) of presample
response, error model innovation, or unconditional disturbance data.
Each selected predictor variable contains a single path of observations.
forecast
applies all selected predictor variables to each
forecasted path.
Each row is a presample observation, and measurements in each row occur
simultaneously. The last row contains the latest presample observation.
forecast
uses only the latest required rows. For more details,
see Time Base Partitions for Forecasting.
Presample unconditional disturbances ut are required to initialize the error model for forecasting. You can specify presample unconditional disturbances in one of the following ways:
Specify numpreobs ≥ Mdl.P presample response and predictor data to enable forecast to infer presample unconditional disturbances.
Specify numpreobs ≥ Mdl.P presample unconditional disturbances without presample error model innovations. forecast ignores specified presample response and predictor data.
Specify numpreobs ≥ Mdl.Q presample error model innovations without presample unconditional disturbances. forecast ignores specified presample response and predictor data.
Specify numpreobs ≥ max(Mdl.P,Mdl.Q) presample error model innovations and unconditional disturbances only. forecast ignores specified presample response and predictor data.
If Presample is a timetable, all the following conditions must be true:
Presample must represent a sample with a regular datetime time step (see isregular).
The datetime vector of sample timestamps Presample.Time must be ascending or descending.
If Presample
is a table, the last row contains the latest
presample observation.
By default, forecast sets all necessary presample unconditional disturbances in one of the following ways:
If forecast cannot infer enough unconditional disturbances from specified presample response and predictor data, forecast sets all necessary presample unconditional disturbances to zero.
If you specify at least Mdl.P + Mdl.Q presample unconditional disturbances, forecast infers all necessary presample error model innovations from the specified presample unconditional disturbances. Otherwise, forecast sets all necessary presample error model innovations to zero.
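As an illustration of the tabular presample workflow, the following is a minimal sketch that stores presample unconditional disturbances in a table and selects them by variable name. The model coefficients, the disturbance values, the table PS, and the variable name "U0" are all hypothetical. The sketch assumes R2023b or later and assumes that the output variable names are built from Mdl.SeriesName, as in the GDPRate_Response variable of the example output earlier on this page.

```matlab
% Sketch only: presample unconditional disturbances supplied in a table.
% Requires the table and timetable syntaxes introduced in R2023b.
Mdl = regARIMA(Intercept=0,AR={0.5},MA={-0.3},Variance=1);
Mdl.SeriesName = "y"; % Output variables are expected to start with "y_"

% Hypothetical presample disturbances; the last row is the latest value.
u0 = [0.4; -0.2; 0.7];
PS = table(u0,VariableNames="U0");

% numpreobs = 3 rows is at least Mdl.P + Mdl.Q = 2, so forecast can infer
% the required presample error model innovations from the disturbances.
numperiods = 5;
Tbl = forecast(Mdl,numperiods,Presample=PS, ...
    PresampleRegressionDisturbanceVariable="U0");

% Numeric-array counterpart of the same call, for comparison.
[yF,yMSE] = forecast(Mdl,numperiods,U0=u0);
```

Because the model has no regression component, the call needs neither InSample nor PredictorVariables.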
PresampleRegressionDisturbanceVariable
— Presample unconditional disturbance variable ut to select from Presample
string scalar | character vector | integer | logical vector
Since R2023b
Presample unconditional disturbance variable
ut to select from
Presample
containing presample unconditional disturbance data,
specified as one of the following data types:
String scalar or character vector containing a variable name in Presample.Properties.VariableNames
Variable index (positive integer) to select from Presample.Properties.VariableNames
A logical vector, where PresampleRegressionDisturbanceVariable(j) = true selects variable j from Presample.Properties.VariableNames
The selected variable must be a numeric vector and cannot contain missing values (NaNs).
If you specify presample unconditional disturbance data in
Presample
, you must specify
PresampleRegressionDisturbanceVariable
.
Example: PresampleRegressionDisturbanceVariable="StockRateU0"
Example: PresampleRegressionDisturbanceVariable=[false false true false] or PresampleRegressionDisturbanceVariable=3 selects the third table variable as the presample unconditional disturbance variable.
Data Types: double
| logical
| char
| cell
| string
InSample
— Forecasted (future) predictor data
table | timetable
Since R2023b
Forecasted (future) predictor data for the model regression component, specified as
a table or timetable. InSample
contains numvars
variables, including numpreds
predictor variables
xt.
forecast
returns the forecasted variables in the output
table or timetable Tbl
, which is commensurate with
InSample
.
Each row corresponds to an observation in the forecast horizon, the first row is the
earliest observation, and measurements in each row, among all paths, occur
simultaneously. InSample
must have at least
numperiods
rows to cover the forecast horizon. If you supply more
rows than necessary, forecast
uses only the first
numperiods
rows.
Each selected predictor variable is a numeric vector without missing values
(NaN
s). forecast
applies the specified
predictor variables to all forecasted paths.
If InSample
is a timetable, the following conditions apply:
If InSample
is a table, the last row contains the latest
observation.
By default, forecast
does not include the regression
component in the model, regardless of the value of Mdl.Beta
.
PredictorVariables
— Predictor variables xt to select from InSample
string vector | cell vector of character vectors | vector of integers | logical vector
Since R2023b
Predictor variables xt to select from
InSample
containing predictor data for the model regression
component in the forecast horizon, specified as one of the following data types:
String vector or cell vector of character vectors containing numpreds variable names in InSample.Properties.VariableNames
A vector of unique indices (positive integers) of variables to select from InSample.Properties.VariableNames
A logical vector, where PredictorVariables(j) = true selects variable j from InSample.Properties.VariableNames
The selected variables must be numeric vectors and cannot contain missing values (NaNs).
By default, forecast
excludes the regression component,
regardless of its presence in Mdl
.
Example: PredictorVariables=["M1SL" "TB3MS" "UNRATE"]
Example: PredictorVariables=[true false true false] or PredictorVariables=[1 3] selects the first and third table variables to supply the predictor data.
Data Types: double
| logical
| char
| cell
| string
Name-Value Arguments
Specify optional pairs of arguments as
Name1=Value1,...,NameN=ValueN
, where Name
is
the argument name and Value
is the corresponding value.
Name-value arguments must appear after other arguments, but the order of the
pairs does not matter.
Before R2021a, use commas to separate each name and value, and enclose
Name
in quotes.
Example: forecast(Mdl,10,Y0=y0,X0=Pred0,XF=Pred) specifies the presample response path y0, and the presample and forecast sample predictor data Pred0 and Pred, respectively, to forecast a model with a regression component.
Y0
— Presample response data yt to infer presample unconditional disturbances ut
numeric column vector | numeric matrix
Presample response data yt to infer
presample unconditional disturbances ut,
specified as a numpreobs
-by-1 numeric column vector or a
numpreobs
-by-numpaths
numeric matrix. When you
supply Y0
, supply all optional data as numeric arrays, and
forecast
returns results in numeric arrays.
Presample unconditional disturbances ut
are required to initialize the error model for forecasting.
forecast
infers presample unconditional disturbances from
Y0
and specified presample predictor data
X0
. Therefore, if you specify presample unconditional
disturbances U0
, forecast
ignores
Y0
and X0
.
numpreobs
is the number of presample observations.
numpaths
is the number of independent presample paths, from which
forecast
initializes the resulting
numpaths
forecasts (see Algorithms).
Each row is a presample observation, and measurements in each row occur
simultaneously. The last row contains the latest presample observation.
numpreobs
must be at least Mdl.P
to initialize
the model. If numpreobs
> Mdl.P
,
forecast
uses only the latest Mdl.P
rows.
For more details, see Time Base Partitions for Forecasting.
Columns of Y0
correspond to separate, independent presample
paths.
If Y0 is a column vector, it represents a single path of the response series. forecast applies it to each forecasted path. In this case, all forecast paths Y derive from the same initial responses.
If Y0 is a matrix, each column represents a presample path of the response series. numpaths is the maximum among the second dimensions of the specified presample observation matrices Y0, E0, and U0.
By default, forecast
defers to specified or default
presample unconditional disturbances U0
.
Data Types: double
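To illustrate how the columns of Y0 map to forecast paths, the following is a minimal sketch with a regression model that has AR(1) errors and no regression component. The coefficients and presample values are hypothetical. With these assumptions, the h-step forecast of each path reduces to the intercept plus 0.6^h times the latest presample disturbance, which the sketch computes by hand for comparison.

```matlab
% Sketch only: one forecast path per column of Y0.
% Model: y_t = 1 + u_t, u_t = 0.6*u_(t-1) + e_t (illustrative values).
Mdl = regARIMA(Intercept=1,AR={0.6},Variance=0.5);

% Two presample paths (columns) with two observations each (rows).
% forecast uses only the latest Mdl.P = 1 row of each path.
Y0 = [1.2 0.4;
      0.9 1.8];

numperiods = 4;
Y = forecast(Mdl,numperiods,Y0=Y0); % numperiods-by-2 matrix

% Hand computation under the stated model: the latest disturbance is
% y - intercept, and it decays geometrically over the horizon.
uLast = Y0(end,:) - 1;
expected = 1 + 0.6.^((1:numperiods)') .* uLast;
maxDiff = max(abs(Y - expected),[],"all")
```

Each column of Y starts from its own presample path, so the two paths approach the intercept from opposite sides.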
X0
— Presample predictor data xt used to infer the presample unconditional disturbances ut
numeric matrix
Presample predictor data xt used to
infer the presample unconditional disturbances
ut, specified as a
numpreobs
-by-numpreds
numeric matrix. Use
X0
only when you supply the numeric array of presample response
data Y0
and your model contains a regression component.
numpreds
= numel(Mdl.Beta)
.
Presample unconditional disturbances ut
are required to initialize the error model for forecasting.
forecast
infers presample unconditional disturbances from
X0
and specified presample response data
Y0
. Therefore, if you specify presample unconditional
disturbances U0
, forecast
ignores
Y0
and X0
.
Each row is a presample observation, and measurements in each row occur
simultaneously. The last row contains the latest presample observation.
numpreobs
must be at least Mdl.P
to initialize
the model. If numpreobs
> Mdl.P
,
forecast
uses only the latest Mdl.P
rows.
For more details, see Time Base Partitions for Forecasting.
Each column is an individual predictor variable. forecast applies X0 to each path; that is, X0 represents one path of observed predictors.
If you specify X0
but you do not specify forecasted predictor
data XF
, forecast
issues an error.
By default, forecast
drops the regression component from
the model when it infers presample unconditional disturbances, regardless of the value
of the regression coefficient Mdl.Beta
.
Data Types: double
U0
— Presample unconditional disturbance data ut
numeric column vector | numeric matrix
Presample unconditional disturbance data
ut to initialize the autoregressive (AR)
component of the ARIMA error model, specified as a numpreobs
-by-1
numeric column vector or a numpreobs
-by-numpaths
numeric matrix. When you supply U0
, supply all optional data as
numeric arrays, and forecast
returns results in numeric
arrays.
Each row is a presample observation, and measurements in each row occur
simultaneously. The last row contains the latest presample observation.
numpreobs
must be at least Mdl.P
to initialize
the model. If numpreobs
> Mdl.P
,
forecast
uses only the latest Mdl.P
rows.
For more details, see Time Base Partitions for Forecasting.
Columns of U0
correspond to separate, independent presample
paths.
If U0 is a column vector, it represents a single path of the unconditional disturbance series. forecast applies it to each forecasted path. In this case, all forecasted paths derive from the same initial unconditional disturbances.
If U0 is a matrix, each column represents a presample path of the unconditional disturbance series. numpaths is the maximum among the second dimensions of the specified presample observation matrices Y0, E0, and U0.
By default, if the presample data (Y0
and
X0
) contains at least Mdl.P
rows,
forecast
infers U0
from the presample
data. If you do not specify presample data, then all required presample unconditional
disturbances are zero.
Data Types: double
E0
— Presample error model innovation data εt
numeric column vector | numeric matrix
Presample error model innovation data εt used to initialize the moving average (MA) component of the ARIMA error model, specified as a numpreobs-by-1 column vector or numpreobs-by-numpaths numeric matrix. Use E0 only when you supply the numeric array of presample response data Y0. forecast assumes that the presample innovations have a mean of zero.
Each row is a presample observation, and measurements in each row occur
simultaneously. The last row contains the latest presample observation.
numpreobs
must be at least Mdl.Q
to initialize
the model. If numpreobs
is greater than required,
forecast
uses only the latest required rows.
Columns of E0
correspond to separate, independent presample
paths.
If E0 is a column vector, it represents a single path of the innovation series. forecast applies it to each forecasted path. In this case, all forecasts derive from the same initial error model innovations.
If E0 is a matrix, each column represents a presample path of the error model innovation series. numpaths is the maximum among the second dimensions of the specified presample observation matrices Y0, E0, and U0.
By default, if U0
contains at least Mdl.P
+ Mdl.Q
rows, forecast
infers
E0
from U0
. If U0
has an
insufficient number of rows and forecast
cannot infer
sufficient observations of U0
from the presample data
(Y0
and X0
), forecast
sets necessary presample error model innovations to zero.
Data Types: double
PresampleResponseVariable
— Response variable yt to select from Presample
string scalar | character vector | integer | logical vector
Since R2023b
Response variable yt to select from
Presample
containing the presample response data, specified as
one of the following data types:
String scalar or character vector containing a variable name in Presample.Properties.VariableNames
Variable index (positive integer) to select from Presample.Properties.VariableNames
A logical vector, where PresampleResponseVariable(j) = true selects variable j from Presample.Properties.VariableNames
forecast
uses specified presample response and predictor
data to infer presample unconditional disturbances. If you specify enough presample
unconditional disturbances or error model innovations by using
Presample
and
PresampleRegressionDisturbanceVariable
or
PresampleInnovationVariable
, forecast
ignores PresamplePredictorVariables
and
PresampleResponseVariable
.
The selected variable must be a numeric vector and cannot contain missing values
(NaN
s).
If you specify presample response data by using the Presample
name-value argument, you must specify
PresampleResponseVariable
.
Example: PresampleResponseVariable="StockRate"
Example: PresampleResponseVariable=[false false true false]
or
PresampleResponseVariable=3
selects the third table variable as
the response variable.
Data Types: double
| logical
| char
| cell
| string
PresamplePredictorVariables
— Presample predictor variables xt to select from Presample
string vector | cell vector of character vectors | vector of integers | logical vector
Since R2023b
Presample predictor variables xt to
select from Presample
containing presample predictor data for the
regression component in the presample period, specified as one of the following data types:
String vector or cell vector of character vectors containing numpreds variable names in Presample.Properties.VariableNames
A vector of unique indices (positive integers) of variables to select from Presample.Properties.VariableNames
A logical vector, where PresamplePredictorVariables(j) = true selects variable j from Presample.Properties.VariableNames
forecast
uses specified presample response and predictor
data to infer presample unconditional disturbances. If you specify enough presample
unconditional disturbances or error model innovations by using
Presample
and
PresampleRegressionDisturbanceVariable
or
PresampleInnovationVariable
, forecast
ignores PresamplePredictorVariables
and
PresampleResponseVariable
.
The selected variables must be numeric vectors and cannot contain missing values
(NaN
s).
If you specify presample predictor data, you must also specify in-sample predictor
data by using the InSample
and
PredictorVariables
name-value arguments.
By default, forecast
excludes the regression component,
regardless of its presence in Mdl
.
Example: PresamplePredictorVariables=["M1SL" "TB3MS" "UNRATE"]
Example: PresamplePredictorVariables=[true false true false] or PresamplePredictorVariables=[1 3] selects the first and third table variables to supply the predictor data.
Data Types: double | logical | char | cell | string
PresampleInnovationVariable
— Presample error model innovation variable of εt to select from Presample
string scalar | character vector | integer | logical vector
Since R2023b
Presample error model innovation variable of
εt to select from
Presample
containing presample error model innovation data,
specified as one of the following data types:
String scalar or character vector containing a variable name in Presample.Properties.VariableNames
Variable index (positive integer) to select from Presample.Properties.VariableNames
A logical vector, where PresampleInnovationVariable(j) = true selects variable j from Presample.Properties.VariableNames
The selected variable must be a numeric matrix and cannot contain missing values
(NaN
s).
If you specify presample error model innovation data in
Presample
, you must specify
PresampleInnovationVariable
.
Example: PresampleInnovationVariable="StockRateDist0"
Example: PresampleInnovationVariable=[false false true false]
or
PresampleInnovationVariable=3
selects the third table variable as
the presample error model innovation variable.
Data Types: double | logical | char | cell | string
XF
— Forecasted (or future) predictor data
numeric matrix
Forecasted (or future) predictor data, specified as a numeric matrix with
numpreds
columns. XF
represents the evolution
of specified presample predictor data X0
forecasted into the
future (the forecast period). Use XF
only when you supply the
numeric array of presample response and predictor data Y0
and
X0
, respectively.
Rows of XF correspond to time points in the future; XF(t,:) contains the t-period-ahead predictor forecasts. XF must have at least numperiods rows. If the number of rows exceeds numperiods, forecast uses only the first (earliest) numperiods forecasts. For more details, see Time Base Partitions for Forecasting.
Columns of XF
are separate time series variables, and they
correspond to the columns of X0
and
Mdl.Beta
.
forecast
treats XF
as a fixed
(nonstochastic) matrix.
By default, the forecast
function generates forecasts from
Mdl
without a regression component, regardless of the value of
the regression coefficient Mdl.Beta
.
Note
NaN values in X0, Y0, U0, E0, and XF indicate missing values. forecast removes missing values from specified data by list-wise deletion.
For the presample, forecast horizontally concatenates the possibly jagged arrays X0, Y0, U0, and E0 with respect to the last rows, and then it removes any row of the concatenated matrix containing at least one NaN.
For in-sample data, forecast removes any row of XF containing at least one NaN.
This type of data reduction reduces the effective sample size and can create an irregular time series.
For numeric data inputs, forecast assumes that you synchronize the presample data such that the latest observations occur simultaneously.
forecast issues an error when any table or timetable input contains missing values.
Set presample response and predictor data to the same response and predictor data as used in the estimation, simulation, or inference of Mdl. This assignment ensures correct inference of the required presample unconditional disturbances.
To include a regression component in the response forecast, you must specify the forecasted predictor data. You can specify forecasted predictor data without also specifying presample predictor data, but forecast issues an error when you specify presample predictor data without also specifying forecasted predictor data.
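The list-wise deletion described in the note can be mimicked in base MATLAB. The following is a minimal sketch, assuming two hypothetical jagged presample arrays. It illustrates the documented behavior (align on the last rows, then drop any row containing a NaN) and is not the toolbox's internal code.

```matlab
% Hypothetical jagged presample arrays; NaN marks a missing value.
Y0 = [10; NaN; 12; 13; 14];            % five presample responses
X0 = [1 5; 2 6; NaN 7; 4 8];           % four presample predictor rows

% Align on the last rows: keep only the most recent n rows of each array.
n = min(size(Y0,1),size(X0,1));
Z = [Y0(end-n+1:end,:) X0(end-n+1:end,:)];

% Remove every row of the concatenated matrix that contains a NaN.
keep = ~any(isnan(Z),2);
Z = Z(keep,:);

% Split the cleaned matrix back into responses and predictors.
Y0clean = Z(:,1);
X0clean = Z(:,2:end);
```

In this sketch the surviving rows are not consecutive in time, which is the irregular time series that the note warns about.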
Output Arguments
Y
— MMSE forecasted responses
numeric column vector | numeric matrix
MMSE forecasted responses yt, returned as
a numperiods
-by-1 column vector or a
numperiods
-by-numpaths
numeric matrix.
Y represents a continuation of Y0 (Y(1,:) occurs in the time point immediately after Y0(end,:)). forecast returns Y by default and when you supply optional presample data in numeric arrays.
Y(t,:) contains the t-period-ahead forecasts, or the forecast of all paths for time point t in the forecast period.
forecast
determines numpaths
from the
number of columns in the presample data sets Y0
,
E0
, and U0
. For details, see Algorithms. If each presample data set
has one column, Y
is a column vector.
Data Types: double
YMSE
— MSE of forecasted responses
numeric column vector | numeric matrix
MSE of the forecasted responses Y
(forecast error variances),
returned as a numperiods
-by-1 column vector or a
numperiods
-by-numpaths
numeric matrix.
forecast returns YMSE by default and when you supply optional presample data in numeric arrays.
YMSE(t,:) contains the forecast error variances of all paths for time point t in the forecast period.
forecast
determines numpaths
from the
number of columns in the presample data sets Y0
,
E0
, and U0
. For details, see Algorithms. If you do not specify any
presample data sets, or if each data set is a column vector, YMSE
is
a column vector.
The square roots of YMSE
are the standard errors of the forecasts
Y
.
Data Types: double
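Because the square roots of YMSE are standard errors, they can be turned into approximate forecast intervals. The following is a minimal sketch, assuming a hypothetical fully specified model with AR(2) errors and Gaussian innovations; the coefficient values and the 1.96 multiplier (an approximate 95% normal quantile) are illustrative assumptions, not values from this page.

```matlab
% Hypothetical fully specified regression model with AR(2) errors.
% No regression coefficients are set, so no predictor data is needed.
Mdl = regARIMA('Intercept',1,'AR',{0.5 -0.2},'Variance',0.8);

numperiods = 10;
[Y,YMSE] = forecast(Mdl,numperiods);   % forecasts and forecast MSEs

% Standard errors and approximate 95% intervals (normality assumed).
se    = sqrt(YMSE);
lower = Y - 1.96*se;
upper = Y + 1.96*se;
```

The interval width is driven by se alone, so it reflects only the uncertainty of the ARIMA error model, consistent with the treatment of predictor data as nonstochastic.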
U
— MMSE forecasts of ARIMA error model unconditional disturbances
numeric matrix
MMSE forecasts of ARIMA error model unconditional disturbances, returned as a
numperiods
-by-1 column vector or a
numperiods
-by-numpaths
numeric matrix.
U represents a continuation of U0 (U(1,:) occurs in the time point immediately after U0(end,:)). forecast returns U by default and when you supply optional presample data in numeric arrays.
U(t,:) contains the t-period-ahead forecasted unconditional disturbances, or the conditional mean forecast of the error model over all paths for time point t in the forecast period.
forecast
determines numpaths
from the
number of columns in the presample data sets Y0
,
E0
, and U0
. For details, see Algorithms.
Data Types: double
Tbl
— Paths of MMSE forecasts of responses yt, corresponding forecast MSEs, and MMSE forecasts of unconditional disturbances ut
table | timetable
Since R2023b
Paths of MMSE forecasts of responses yt,
corresponding forecast MSEs, and MMSE forecasts of unconditional disturbances
ut, returned as a table or timetable, the
same data type as Presample
or InSample
.
forecast
returns Tbl
only when you supply
Presample
or InSample
.
Tbl
contains the following variables:
The forecasted response paths, which are in a numperiods-by-numpaths numeric matrix, with rows representing periods in the forecast horizon and columns representing independent paths, each corresponding to the input presample paths in Presample or preceding the in-sample period in InSample. forecast names the forecasted response variable responseName_Response, where responseName is Mdl.SeriesName. For example, if Mdl.SeriesName is GDP, Tbl contains a variable for the corresponding forecasted response paths with the name GDP_Response.
Each path in Tbl.responseName_Response represents the continuation of the corresponding presample response path in Presample (Tbl.responseName_Response(1,:) occurs in the next time point, with respect to the periodicity of Presample, after the last presample response). Tbl.responseName_Response(j,k) contains the j-period-ahead forecasted response of path k.
The forecast MSE paths, which are in a numperiods-by-numpaths numeric matrix, with rows representing periods in the forecast horizon and columns representing independent paths, each corresponding to the forecasted responses in Tbl.responseName_Response. forecast names the forecast MSEs responseName_MSE, where responseName is Mdl.SeriesName. For example, if Mdl.SeriesName is GDP, Tbl contains a variable for the corresponding forecast MSE with the name GDP_MSE.
The forecasted unconditional disturbance paths, which are in a numperiods-by-numpaths numeric matrix, with rows representing periods in the forecast horizon and columns representing independent paths. forecast names the forecasted unconditional disturbance variable responseName_RegressionInnovation, where responseName is Mdl.SeriesName. For example, if Mdl.SeriesName is GDP, Tbl contains a variable for the corresponding forecasted unconditional disturbance paths with the name GDP_RegressionInnovation.
Each path in Tbl.responseName_RegressionInnovation represents a continuation of the presample unconditional disturbance process, either supplied by or inferred from Presample, or set by default (Tbl.responseName_RegressionInnovation(1,:) occurs in the next time point, with respect to the periodicity of Presample, after the last presample unconditional disturbance). Tbl.responseName_RegressionInnovation(j,k) contains the j-period-ahead forecasted unconditional disturbance of path k.
When you supply InSample, Tbl contains all variables in InSample.
If Presample
is a timetable, the following conditions hold:
The row order of Tbl, either ascending or descending, matches the row order of Presample.
Tbl.Time(1) is the next time after Presample.Time(end) relative to the sampling frequency, and Tbl.Time(2:numobs) are the following times relative to the sampling frequency.
More About
Time Base Partitions for Forecasting
Time base partitions for forecasting are two
disjoint, contiguous intervals of the time base; each interval contains time series data for
forecasting a dynamic model. The forecast period (forecast horizon)
is a numperiods
length partition at the end of the time base during
which forecast
generates forecasts Y
from the
dynamic model Mdl
. The presample period is the
entire partition occurring before the forecast period. forecast
can
require observed responses Y0
, regression data X0
,
unconditional disturbances U0
, or innovations E0
in the presample period to initialize the dynamic model for forecasting. The model structure
determines the types and amounts of required presample observations.
A common practice is to fit a dynamic model to a portion of the data set, then validate
the predictability of the model by comparing its forecasts to observed responses. During
forecasting, the presample period contains the data to which the model is fit, and the
forecast period contains the holdout sample for validation. Suppose that
yt is an observed response series;
x1,t,
x2,t, and
x3,t are observed exogenous
series; and time t = 1,…,T. Consider forecasting responses from a dynamic model of yt containing a regression component over numperiods = K periods. Suppose that the dynamic model is fit to the data in the interval [1,T – K] (for more details, see estimate). This figure shows the time base partitions for forecasting.
For example, to generate forecasts Y
from a regression model with
AR(2) errors, forecast
requires presample unconditional disturbances
U0
and future predictor data XF
.
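forecast infers unconditional disturbances given enough readily available presample responses and predictor data. To initialize an AR(2) error model, the two most recent presample observations are required:

$$Y0 = \begin{bmatrix} y_{T-K-1} \\ y_{T-K} \end{bmatrix} \quad \text{and} \quad X0 = \begin{bmatrix} x_{1,T-K-1} & x_{2,T-K-1} & x_{3,T-K-1} \\ x_{1,T-K} & x_{2,T-K} & x_{3,T-K} \end{bmatrix}.$$

To include the regression component in the forecasts, forecast requires future exogenous data

$$XF = \begin{bmatrix} x_{1,(T-K+1):T} & x_{2,(T-K+1):T} & x_{3,(T-K+1):T} \end{bmatrix}.$$

This figure shows the arrays of required observations for the general case, with corresponding input and output arguments.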
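The partitioning can be sketched in code. The following example assumes hypothetical values for T and K, three simulated exogenous series, and a known regression model with AR(2) errors in place of an estimated one; all numeric values are illustrative. It slices the time base into the presample rows needed to initialize the AR(2) error model and the K forecast-period rows of predictor data.

```matlab
rng(1);                       % for reproducibility
T = 100;                      % hypothetical length of the time base
K = 10;                       % hypothetical forecast horizon (numperiods)

% Hypothetical exogenous series and a known model used in place of a fit.
X = randn(T,3);
Mdl = regARIMA('Intercept',0,'AR',{0.5 -0.2}, ...
    'Beta',[1; -0.5; 2],'Variance',1);
y = simulate(Mdl,T,'X',X);    % simulated responses, for illustration only

% Presample period: the last two observations before the forecast period.
Y0 = y(T-K-1:T-K);
X0 = X(T-K-1:T-K,:);

% Forecast period: the final K rows of predictor data.
XF = X(T-K+1:T,:);

[Y,YMSE,U] = forecast(Mdl,K,Y0=Y0,X0=X0,XF=XF);
```

In a validation workflow, Y would then be compared with the holdout responses y(T-K+1:T).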
Algorithms
The forecast function sets the number of sample paths numpaths to the maximum number of columns among the specified presample data sets:
For input numeric arrays of presample data, numpaths is the maximum width among Y0, E0, and U0.
For an input table or timetable of presample data, numpaths is the maximum width among the variables representing the presample responses PresampleResponseVariable, error model innovations PresampleInnovationVariable, and unconditional disturbances PresampleRegressionDisturbanceVariable.
All specified presample data sets must have either one column or numpaths > 1 columns. Otherwise, forecast issues an error. For example, if you supply Y0 and E0, and Y0 has five columns representing five paths, then E0 can have one column or five columns. If E0 has one column, forecast applies E0 to each path.
forecast computes the forecasted response MSEs by treating the predictor data matrices as nonstochastic and statistically independent of the model innovations. Therefore, the forecast MSEs reflect the variances associated with the unconditional disturbances of the ARIMA error model alone.
forecast uses presample response and predictor data to infer presample unconditional disturbances. Therefore, if you specify presample unconditional disturbances, forecast ignores any specified presample response and predictor data.
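The column-count rule for presample data can be illustrated in base MATLAB. The following is a minimal sketch, assuming hypothetical presample arrays; it reproduces the documented rule (numpaths is the maximum width, and every supplied array has either one column or numpaths columns) and is not the toolbox's internal code.

```matlab
% Hypothetical presample arrays: five response paths, one innovation path,
% and no unconditional disturbances supplied.
Y0 = randn(4,5);
E0 = randn(4,1);
U0 = [];

% Width (number of columns) of each array; an empty array has width 0.
data     = {Y0,E0,U0};
widths   = cellfun(@(A) size(A,2),data);
supplied = widths > 0;

% numpaths is the maximum width among the supplied arrays.
numpaths = max(widths(supplied));

% Each supplied array must have one column or numpaths columns.
isValid = all(widths(supplied) == 1 | widths(supplied) == numpaths);

% A single-column array is applied to every path.
if size(E0,2) == 1
    E0full = repmat(E0,1,numpaths);
else
    E0full = E0;
end
```

With these inputs, the five-column Y0 sets numpaths and the single E0 column is replicated across all five paths.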
References
[1] Box, George E. P., Gwilym M. Jenkins, and Gregory C. Reinsel. Time Series Analysis: Forecasting and Control. 3rd ed. Englewood Cliffs, NJ: Prentice Hall, 1994.
[2] Davidson, R., and J. G. MacKinnon. Econometric Theory and Methods. Oxford, UK: Oxford University Press, 2004.
[3] Enders, Walter. Applied Econometric Time Series. Hoboken, NJ: John Wiley & Sons, Inc., 1995.
[4] Hamilton, James D. Time Series Analysis. Princeton, NJ: Princeton University Press, 1994.
[5] Pankratz, A. Forecasting with Dynamic Regression Models. John Wiley & Sons, Inc., 1991.
[6] Tsay, R. S. Analysis of Financial Time Series. 2nd ed. Hoboken, NJ: John Wiley & Sons, Inc., 2005.
Version History
Introduced in R2013b
R2023b: forecast accepts input data in tables and timetables, and returns results in tables and timetables
In addition to accepting input presample and in-sample data in numeric arrays, forecast accepts input data in tables or regular timetables. Use Presample to supply presample data and InSample to provide in-sample (future) predictor data for the model regression component in the forecast horizon.
When you supply data in a table or timetable, the following conditions apply:
forecast chooses the default presample response series on which to operate, but you can use the optional PresampleResponseVariable name-value argument to select a different variable.
forecast returns results in a table or timetable.
Name-value arguments to support tabular workflows include:
Presample specifies the input table or timetable of presample response, predictor, error model disturbance, or regression innovation data.
PresampleResponseVariable specifies the name of the response series to select from Presample.
PresamplePredictorVariables specifies the names of predictor series to select from Presample.
PresampleRegressionDisturbanceVariable specifies the name of the unconditional disturbance series to select from Presample.
PresampleInnovationVariable specifies the name of the error model innovation series to select from Presample.
InSample specifies the table or regular timetable of future predictor data for a model regression component.
PredictorVariables specifies the names of the predictor series to select from InSample.