Filter Data Through State-Space Model in Real Time
This example shows how to nowcast a state-space model. In other words, the example updates the state distribution moments as economic data becomes available, even when individual measurements are released asynchronously.
This example uses a state-space formulation for a dynamic factor model. The model variables include:
Two states, which compose a vector autoregression of demand and supply.
Quarterly growth rates of US CPI, GDP, 3-month treasury bill yield, and the unemployment rate. Each measurement is a variable in the observation equation and each is a linear combination of supply and demand perturbed by measurement error.
Symbolically, the state-space formulation is
To nowcast the model, the example performs the following procedure:
Create a state-space model template for estimation.
Define a holdout set from the estimation data.
Fit the model to the in-sample data.
Configure the holdout data set to simulate asynchronously released measurements.
Nowcast the estimated model to update the state distribution states.
Create State-Space Model Estimation Template
To estimate a state-space model, the estimate
function requires a template specifying the structure of the model, which parameters are unknown or fixed, and other characteristics.
Create a state-space model for the dynamic factor model. Specify NaN values for unknown parameters, and specify that the states are stationary processes by setting their type (StateType
) to 0
.
numStates = 2; numSeries = 4; A = tril(nan(numStates)); B = eye(numStates); C = nan(numSeries,numStates); D = diag(NaN(numSeries,1)); Mdl = ssm(A,B,C,D,StateType=[0 0]);
Mdl
is an ssm
object representing a template for estimation.
Load and Preprocess Data
Load the US macroeconomic data set Data_USEconModel.mat
. The data contains quarterly measurements of 14 economic series from Q1:1947 through Q1:2009.
load Data_USEconModel
DataTimeTable
is a 249-by-14 timetable that contains the data and sample timestamps. For more details, enter Description
at the command line.
Extract the CPI, GDP, 3-month treasury bill yield, and the unemployment rate.
DataLevel = DataTimeTable{:,["CPIAUCSL" "GDP" "TB3MS" "UNRATE"]}; anyMissing = any(any(ismissing(DataLevel)))
anyMissing = logical
1
DataLevel
is a 249-by-4 numeric matrix of the measurements. At least one series contains missing values, but the Kalman filter accommodates missing observations by rolling forward the current state distribution as the next state forecast.
Convert the levels to returns by passing the data to price2ret
.
DataGrowth = price2ret(DataLevel)*100;
Because the transformation uses the first difference, DataGrowth
has one less observation than DataLevel
.
Reserve the final 3 years (12 quarters) of data as a holdout sample. Extract corresponding dates.
numPeriods = 12; Y = DataGrowth(1:(end-numPeriods),:); % In-sample data Yf = DataGrowth((end-numPeriods+1):end,:); % Holdout sample T = size(DataGrowth,1); dates = DataTimeTable.Time(2:end-numPeriods); datesf = DataTimeTable.Time(end-numPeriods+1:end);
Estimate Model
The estimation procedure requires initial values for all unknown parameters. Create a vector of containing initial values. For example, if A0
, C0
, and D0
are matrices containing initial values for A
, B
, and C
, arrange the initial values by creating the vector [A0(:,1); A0(:,2); C0(:,1); C0(:,2); D0(:,1); D0(:,2); D0(:,3); D0(:,4)]
. Only unknown parameters require initial values.
params0 = [0.8; 0; 0.5; ones(numStates*numSeries,1); 0.5*std(Y,"omitNan")'];
Fit the model to the in-sample data. Specify the initial values and the univariate treatment of the multivariate model. Turn off the estimation summary.
EstMdl = estimate(Mdl,Y,params0,Univariate=1,Display="off")
EstMdl = State-space model type: ssm State vector length: 2 Observation vector length: 4 State disturbance vector length: 2 Observation innovation vector length: 4 Sample size supported by model: Unlimited State variables: x1, x2,... State disturbances: u1, u2,... Observation series: y1, y2,... Observation innovations: e1, e2,... State equations: x1(t) = (0.60)x1(t-1) + u1(t) x2(t) = -(0.23)x1(t-1) + (0.99)x2(t-1) + u2(t) Observation equations: y1(t) = -(0.07)x1(t) + (0.12)x2(t) + (0.61)e1(t) y2(t) = -(0.70)x1(t) + (0.18)x2(t) + (0.63)e2(t) y3(t) = -(6.87)x1(t) + (0.07)x2(t) + (16.18)e3(t) y4(t) = (5.30)x1(t) + (0.07)x2(t) + (3.96)e4(t) Initial state distribution: Initial state means x1 x2 0 0 Initial state covariance matrix x1 x2 x1 1.57 -0.53 x2 -0.53 76.34 State types x1 x2 Stationary Stationary
Check whether the estimated VAR state model is stable.
varlagop = LagOp({eye(numStates) EstMdl.A}); isStable(varlagop)
ans = logical
1
The VAR state model is stable.
Simulate Asynchronous Release of Measurements
To simulate the asynchronous release of economic measurements within a quarter, suppose each measurement is released in successive weeks. For observation t
in the holdout sample:
Create a
numSeries
-by-numSeries
matrix ofNaN
values.Replace the diagonal elements of the NaN matrix with the observations of period
t
.
Stack each matrix.
YfStretch = nan(numPeriods*numSeries,numSeries); for t = 1:numPeriods for j = 1:numSeries YfStretch((t-1)*numSeries + j,j) = Yf(t,j); end end
YfStretch
is a numPeriods*numSeries
-by-numSeries
(48-by-4) block matrix of asynchronously released measurements over the holdout sample period. For example, block 1 is YfStretch(1:4,:)
and it represents the first observation in the holdout sample Yf(1,:)
where the four measurements are released at different times or subperiods (rows), block 2 is YfStretch(5:8,:)
and it represents the second observation in the holdout sample Yf(2,:)
where the four measurement are released at different times, and so on.
The estimated model EstMdl
is calibrated so that states transition each period (row in the data). States in the holdout sample must transition only after processing the incoming unemployment growth rate. To achieve this action, create a model for the subperiod observations so that the states remain the same as it processes an observation. Symbolically,
with a degenerate initial state distribution.
EstMdlAux = ssm(eye(numStates),zeros(numStates),EstMdl.C,EstMdl.D,...
Mean0=zeros(numStates,1),Cov0=zeros(numStates));
Nowcast the Model
Obtain state distribution moments at the end of the in-sample period.
[states,~,output] = filter(EstMdl,Y); currentState = states(end,:)'; CurrentStateCov = output(end).FilteredStatesCov;
Preallocate arrays for the nowcasts and dates for the stretched periods.
newStateF = nan(numPeriods*numSeries,numStates); NewStateCovF = cell(numPeriods*numSeries,1); datesfStretch = repmat(dates(end),numPeriods*numSeries,1);
Nowcast the model by using this procedure for each block matrix in the stretched holdout sample:
When a CPI measurement is released, update the current state distribution moments by filtering the measurement through the estimated model
EstMdl
.Suppose subsequent measurements are released one week apart from each other. When each other measurement is released, update the current state distribution moments by filtering the measurement through the estimated model
EstMdlAux
.Replace the current state distribution with the updated distribution after each iteration.
tStretch = 0; % Index of stretched forecast horizon wk = 0; for t = 1:numPeriods for j = 1:numSeries tStretch = tStretch + 1; if j == 1 % Advance to next full period [newStateF(tStretch,:),NewStateCovF{tStretch}] = update(EstMdl,YfStretch(tStretch,:), ... currentState,CurrentStateCov); wk = 0; datesfStretch(tStretch) = datesf(t); else % Stretch current period [newStateF(tStretch,:),NewStateCovF{tStretch}] = update(EstMdlAux,YfStretch(tStretch,:), ... currentState,CurrentStateCov); wk = wk + 1; datesfStretch(tStretch) = datesf(t) + wk; end currentState = newStateF(tStretch,:)'; CurrentStateCov = NewStateCovF{tStretch}; end end
newStateF
is a 48-by-2 block matrix of nowcasted demands and supplies during the subperiods of the holdout period. NewStateCovF
is a 48-by-1 cell array of 2-by-2 covariance matrices for each nowcast.
Plot the in-sample demand and supply estimates for periods beyond 1993, and plot the nowcasts.
dtp = dates > datetime(1994,1,1); figure h = plot(dates(dtp),states(dtp,:),"-",datesfStretch,newStateF,"--"); hold on h(3).Color = h(1).Color; h(4).Color = h(2).Color; yl = ylim; hp = patch([datesfStretch(1) datesfStretch(end) datesfStretch(end) datesfStretch(1)], ... yl([1,1,2,2]),[0.8 0.8 0.8]); uistack(hp,'bottom'); legend([h; hp],... ["Demand (in-sample)"; "Supply (in-sample)"; "Demand (updated)"; "Supply (updated)"; "Forecast horizon"],... Location="best") hold off
See Also
ssm
| update
| estimate
| filter