Main Content

Filter Data Through State-Space Model in Real Time

This example shows how to nowcast a state-space model. In other words, the example updates the state distribution moments as economic data becomes available, even when individual measurements are released asynchronously.

This example uses a state-space formulation for a dynamic factor model. The model variables include:

  • Two states, which compose a vector autoregression of demand and supply.

  • Quarterly growth rates of US CPI, GDP, 3-month treasury bill yield, and the unemployment rate. Each measurement is a variable in the observation equation and each is a linear combination of supply and demand perturbed by measurement error.

Symbolically, the state-space formulation is

[x1,tx2,t]=[ϕ110ϕ21ϕ22][x1,t-1x2,t-1]+[1001][u1,tu2,t][y1,ty2,ty3,ty4,t]=[C11C12C21C22C31C32C41C42][x1,tx2,t]+[D110000D220000D330000D44][ε1,tε2,tε3,tε4,t].

To nowcast the model, the example performs the following procedure:

  1. Create a state-space model template for estimation.

  2. Define a holdout set from the estimation data.

  3. Fit the model to the in-sample data.

  4. Configure the holdout data set to simulate asynchronously released measurements.

  5. Nowcast the estimated model to update the state distribution states.

Create State-Space Model Estimation Template

To estimate a state-space model, the estimate function requires a template specifying the structure of the model, which parameters are unknown or fixed, and other characteristics.

Create a state-space model for the dynamic factor model. Specify NaN values for unknown parameters, and specify that the states are stationary processes by setting their type (StateType) to 0.

numStates = 2;
numSeries = 4;
A = tril(nan(numStates));
B = eye(numStates);
C = nan(numSeries,numStates);
D = diag(NaN(numSeries,1));
Mdl = ssm(A,B,C,D,StateType=[0 0]);

Mdl is an ssm object representing a template for estimation.

Load and Preprocess Data

Load the US macroeconomic data set Data_USEconModel.mat. The data contains quarterly measurements of 14 economic series from Q1:1947 through Q1:2009.

load Data_USEconModel

DataTimeTable is a 249-by-14 timetable that contains the data and sample timestamps. For more details, enter Description at the command line.

Extract the CPI, GDP, 3-month treasury bill yield, and the unemployment rate.

DataLevel = DataTimeTable{:,["CPIAUCSL" "GDP" "TB3MS" "UNRATE"]};
anyMissing = any(any(ismissing(DataLevel)))
anyMissing = logical
   1

DataLevel is a 249-by-4 numeric matrix of the measurements. At least one series contains missing values, but the Kalman filter accommodates missing observations by rolling forward the current state distribution as the next state forecast.

Convert the levels to returns by passing the data to price2ret.

DataGrowth = price2ret(DataLevel)*100;

Because the transformation uses the first difference, DataGrowth has one less observation than DataLevel.

Reserve the final 3 years (12 quarters) of data as a holdout sample. Extract corresponding dates.

numPeriods = 12;
Y = DataGrowth(1:(end-numPeriods),:);       % In-sample data 
Yf = DataGrowth((end-numPeriods+1):end,:);  % Holdout sample
T = size(DataGrowth,1);

dates = DataTimeTable.Time(2:end-numPeriods);
datesf = DataTimeTable.Time(end-numPeriods+1:end);

Estimate Model

The estimation procedure requires initial values for all unknown parameters. Create a vector of containing initial values. For example, if A0, C0, and D0 are matrices containing initial values for A, B, and C, arrange the initial values by creating the vector [A0(:,1); A0(:,2); C0(:,1); C0(:,2); D0(:,1); D0(:,2); D0(:,3); D0(:,4)]. Only unknown parameters require initial values.

params0 = [0.8; 0; 0.5; ones(numStates*numSeries,1); 0.5*std(Y,"omitNan")'];

Fit the model to the in-sample data. Specify the initial values and the univariate treatment of the multivariate model. Turn off the estimation summary.

EstMdl = estimate(Mdl,Y,params0,Univariate=1,Display="off")
EstMdl = 
State-space model type: ssm

State vector length: 2
Observation vector length: 4
State disturbance vector length: 2
Observation innovation vector length: 4
Sample size supported by model: Unlimited

State variables: x1, x2,...
State disturbances: u1, u2,...
Observation series: y1, y2,...
Observation innovations: e1, e2,...

State equations:
x1(t) = (0.60)x1(t-1) + u1(t)
x2(t) = -(0.23)x1(t-1) + (0.99)x2(t-1) + u2(t)

Observation equations:
y1(t) = -(0.07)x1(t) + (0.12)x2(t) + (0.61)e1(t)
y2(t) = -(0.70)x1(t) + (0.18)x2(t) + (0.63)e2(t)
y3(t) = -(6.87)x1(t) + (0.07)x2(t) + (16.18)e3(t)
y4(t) = (5.30)x1(t) + (0.07)x2(t) + (3.96)e4(t)

Initial state distribution:

Initial state means
 x1  x2 
  0   0 

Initial state covariance matrix
     x1     x2    
 x1   1.57  -0.53 
 x2  -0.53  76.34 

State types
     x1          x2     
 Stationary  Stationary 

Check whether the estimated VAR state model is stable.

varlagop = LagOp({eye(numStates) EstMdl.A});
isStable(varlagop)
ans = logical
   1

The VAR state model is stable.

Simulate Asynchronous Release of Measurements

To simulate the asynchronous release of economic measurements within a quarter, suppose each measurement is released in successive weeks. For observation t in the holdout sample:

  1. Create a numSeries-by-numSeries matrix of NaN values.

  2. Replace the diagonal elements of the NaN matrix with the observations of period t.

Stack each matrix.

YfStretch = nan(numPeriods*numSeries,numSeries);
for t = 1:numPeriods
    for j = 1:numSeries
        YfStretch((t-1)*numSeries + j,j) = Yf(t,j);
    end
end

YfStretch is a numPeriods*numSeries-by-numSeries (48-by-4) block matrix of asynchronously released measurements over the holdout sample period. For example, block 1 is YfStretch(1:4,:) and it represents the first observation in the holdout sample Yf(1,:) where the four measurements are released at different times or subperiods (rows), block 2 is YfStretch(5:8,:) and it represents the second observation in the holdout sample Yf(2,:) where the four measurement are released at different times, and so on.

The estimated model EstMdl is calibrated so that states transition each period (row in the data). States in the holdout sample must transition only after processing the incoming unemployment growth rate. To achieve this action, create a model for the subperiod observations so that the states remain the same as it processes an observation. Symbolically,

[x1,tx2,t]=[1001][x1,t-1x2,t-1][y1,ty2,ty3,ty4,t]=[Cˆ11Cˆ12Cˆ21Cˆ22Cˆ31Cˆ32Cˆ41Cˆ42][x1,tx2,t]+[Dˆ110000Dˆ220000Dˆ330000Dˆ44][ε1,tε2,tε3,tε4,t],

with a degenerate initial state distribution.

EstMdlAux = ssm(eye(numStates),zeros(numStates),EstMdl.C,EstMdl.D,...
    Mean0=zeros(numStates,1),Cov0=zeros(numStates));

Nowcast the Model

Obtain state distribution moments at the end of the in-sample period.

[states,~,output] = filter(EstMdl,Y);
currentState = states(end,:)';
CurrentStateCov = output(end).FilteredStatesCov;

Preallocate arrays for the nowcasts and dates for the stretched periods.

newStateF = nan(numPeriods*numSeries,numStates);
NewStateCovF = cell(numPeriods*numSeries,1);
datesfStretch = repmat(dates(end),numPeriods*numSeries,1);

Nowcast the model by using this procedure for each block matrix in the stretched holdout sample:

  1. When a CPI measurement is released, update the current state distribution moments by filtering the measurement through the estimated model EstMdl.

  2. Suppose subsequent measurements are released one week apart from each other. When each other measurement is released, update the current state distribution moments by filtering the measurement through the estimated model EstMdlAux.

  3. Replace the current state distribution with the updated distribution after each iteration.

tStretch = 0;   % Index of stretched forecast horizon
wk = 0;         

for t = 1:numPeriods
    for j = 1:numSeries
        tStretch = tStretch + 1;

        if j == 1   % Advance to next full period
            [newStateF(tStretch,:),NewStateCovF{tStretch}] = update(EstMdl,YfStretch(tStretch,:), ...
                currentState,CurrentStateCov);
            wk = 0;
            datesfStretch(tStretch) = datesf(t);

        else        % Stretch current period
            [newStateF(tStretch,:),NewStateCovF{tStretch}] = update(EstMdlAux,YfStretch(tStretch,:), ...
                currentState,CurrentStateCov);
            wk = wk + 1;
            datesfStretch(tStretch) = datesf(t) + wk;
        end

        currentState = newStateF(tStretch,:)';
        CurrentStateCov = NewStateCovF{tStretch};
    end
end

newStateF is a 48-by-2 block matrix of nowcasted demands and supplies during the subperiods of the holdout period. NewStateCovF is a 48-by-1 cell array of 2-by-2 covariance matrices for each nowcast.

Plot the in-sample demand and supply estimates for periods beyond 1993, and plot the nowcasts.

dtp = dates > datetime(1994,1,1);

figure
h = plot(dates(dtp),states(dtp,:),"-",datesfStretch,newStateF,"--");
hold on
h(3).Color = h(1).Color;
h(4).Color = h(2).Color;
yl = ylim;
hp = patch([datesfStretch(1) datesfStretch(end) datesfStretch(end) datesfStretch(1)], ...
    yl([1,1,2,2]),[0.8 0.8 0.8]);
uistack(hp,'bottom');
legend([h; hp],...
    ["Demand (in-sample)"; "Supply (in-sample)"; "Demand (updated)"; "Supply (updated)"; "Forecast horizon"],...
    Location="best")
hold off

See Also

| | |

Related Topics