Hi, I am using NARX to do multi-step prediction of a daily stock market index (Sensex, a 2003x1 matrix) using another one as input (Nifty, a 2003x1 matrix). I am having a problem with the closed loop.

Hi, I am using NARX to predict a daily stock market index (Sensex, a 2003x1 matrix) as the target, with another daily stock market index (Nifty) as the input. I have followed the example shown in:
The code:
%%%newNARX code 24/4/2013
%% 1. Importing data
% Nifty and Sensex are 2003x1 matrices of
% daily stock market index data
load Nifty.dat;
load Sensex.dat;
% %%S = load('magdata');
% %%X = con2seq(S.u);
% %%T = con2seq(S.y);
% To scale the data it is converted to its log value:
lognifty = log(Nifty);
logsensex = log(Sensex);
X = tonndata(lognifty,false,false);
T = tonndata(logsensex,false,false);
% X = con2seq(x);
% T = con2seq(t);
%% 2. Data preparation
N = 300; % Multi-step ahead prediction
% Input and target series are divided in two groups of data:
% 1st group: used to train the network
inputSeries = X(1:end-N);
targetSeries = T(1:end-N);
% 2nd group: this is the new data used for simulation. inputSeriesVal will
% be used for predicting new targets. targetSeriesVal will be used for
% network validation after prediction
inputSeriesVal = X(end-N+1:end);
targetSeriesVal = T(end-N+1:end); % This is generally not available
%% 3. Network Architecture
delay = 2;
neuronsHiddenLayer = 50;
% Network Creation
net = narxnet(1:delay,1:delay,neuronsHiddenLayer);
%% 4. Training the network
[Xs,Xi,Ai,Ts] = preparets(net,inputSeries,{},targetSeries);
net = train(net,Xs,Ts,Xi,Ai);
view(net)
Y = net(Xs,Xi,Ai);
% Performance for the series-parallel implementation, only
% one-step-ahead prediction
perf = perform(net,Ts,Y);
%% 5. Multi-step ahead prediction
inputSeriesPred = [inputSeries(end-delay+1:end),inputSeriesVal];
targetSeriesPred = [targetSeries(end-delay+1:end), con2seq(nan(1,N))];
netc = closeloop(net);
view(netc)
[Xs,Xi,Ai,Ts] = preparets(netc,inputSeriesPred,{},targetSeriesPred);
yPred = netc(Xs,Xi,Ai);
perf = perform(net,yPred,targetSeriesVal);
figure;
plot([cell2mat(targetSeries),nan(1,N);
nan(1,length(targetSeries)),cell2mat(yPred);
nan(1,length(targetSeries)),cell2mat(targetSeriesVal)]')
legend('Original Targets','Network Predictions','Expected Outputs')
The network predictions are coming out very bad. I guess there is some problem with the closed loop's initial input states and initial layer states. Please help.

Accepted Answer

Greg Heath on 25 Apr 2013
%% 1. Importing data
% Nifty and Sensex are 2003x1 matrices of daily stock market index data
> load Nifty.dat;
> load Sensex.dat;
% To scale the data it is converted to its log value:
> lognifty = log(Nifty);
> logsensex = log(Sensex);
> X = tonndata(lognifty,false,false);
> T = tonndata(logsensex,false,false);
%% 2. Data preparation
> N = 300; % Multi-step ahead prediction
% Input and target series are divided in two groups of data:
% 1st group: used to train the network
> inputSeries = X(1:end-N);
> targetSeries = T(1:end-N);
% 2nd group: this is the new data used for simulation. inputSeriesVal will
% be used for predicting new targets. targetSeriesVal will be used for
% network validation after prediction
Notation:
data = design + test
design = training + validation
Val subsets are used repetitively with Trn subsets to DESIGN a net with a good set of training parameters (e.g., input delays, feedback delays, number of hidden nodes, stopping epoch, etc.). The best of multiple designs is typically chosen by indirectly minimizing MSEval.
After the best design is chosen, the nondesign Test subset is used to estimate generalization performance on nondesign data.
By DEFAULT, the data will be divided RANDOMLY into THREE trn/val/tst subsets according to
dividerand( 2003, 0.7, 0.15, 0.15 )
I disagree with the use of dividerand for uniformly spaced time series. Replace it with one of the other divide functions. (When Nval = Ntst = 0, I use 'dividetrain'. Otherwise I use 'divideblock' or 'divideind' to maintain uniform spacing.)
> inputSeriesVal = X(end-N+1:end);
> targetSeriesVal = T(end-N+1:end); % This is generally not available
Change "Val" to "Test" since the subsets are only used for performance evaluation (NOT "validation") and not design.
Since a NNTBX BUG will not allow a test subset without a validation subset and visa versa, there are two options
1. Use trn/val/tst (Nval = Ntst = 300) and 'divideblock' or 'divideind'
(recommended; see the sketch after this list)
2. a. Remove the test subset (Ntst = 300) from training,
b. Do not use a val subset (Nval = 0),
c. Use 'dividetrain' to train only on the training data (Ntrn = 1703),
d. Calculate the test subset performance separately.
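For concreteness, here is a minimal sketch of option 1 (the 0.70/0.15/0.15 ratios are the toolbox defaults and are my assumption here, not values taken from the post); the settings go on the open-loop net before train is called:
% Option 1 sketch: contiguous (block) division instead of random division,
% so the newest time steps become the validation and test subsets.
net = narxnet(1:delay,1:delay,neuronsHiddenLayer);
net.divideFcn              = 'divideblock';  % keep uniform time spacing
net.divideParam.trainRatio = 0.70;           % oldest ~70% of time steps for training
net.divideParam.valRatio   = 0.15;           % next ~15% for validation stopping
net.divideParam.testRatio  = 0.15;           % newest ~15% for the test estimate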
%% 3. Network Architecture
> delay = 2;
> neuronsHiddenLayer = 50;
Use the autocorrelation function to determine the significant feedback delays. Use the crosscorrelation function to determine the significant input delays.
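A base-MATLAB sketch of that lag screening (the maximum lag of 20 and the 2/sqrt(N) significance bound are my assumptions; it reuses lognifty and logsensex from step 1):
% Sketch: find candidate feedback delays (FD) from the target autocorrelation
% and candidate input delays (ID) from the input/target cross-correlation.
t = logsensex(:) - mean(logsensex);   % zero-mean target series
x = lognifty(:)  - mean(lognifty);    % zero-mean input series
Nt = length(t);
maxlag   = 20;                        % assumed search range
sigbound = 2/sqrt(Nt);                % approximate 95% significance level
autoc  = zeros(1,maxlag);
crossc = zeros(1,maxlag);
for lag = 1:maxlag
    autoc(lag)  = sum(t(1+lag:Nt).*t(1:Nt-lag)) / sum(t.^2);
    crossc(lag) = sum(t(1+lag:Nt).*x(1:Nt-lag)) / sqrt(sum(t.^2)*sum(x.^2));
end
FD = find(abs(autoc)  > sigbound)     % significant feedback delays
ID = find(abs(crossc) > sigbound)     % significant input delays
% net = narxnet(ID,FD,neuronsHiddenLayer);  % then create the net with these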
% Network Creation
> net = narxnet(1:delay,1:delay,neuronsHiddenLayer);
%% 4. Training the network
> [Xs,Xi,Ai,Ts] = preparets(net,inputSeries,{},targetSeries);
% > net = train(net,Xs,Ts,Xi,Ai);
[ net tr Ys Es Xf Af ] = train(net,Xs,Ts,Xi,Ai);
tr = tr % To obtain important info
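For example, the following sketch pulls the most useful quantities out of the training record (these are standard tr fields):
% Sketch: key fields of the training record tr returned by train.
bestEpoch  = tr.best_epoch           % epoch picked by validation stopping
trnMSE     = tr.perf(bestEpoch+1)    % training MSE at that epoch (epoch 0 is index 1)
valMSE     = tr.vperf(bestEpoch+1)   % validation MSE at that epoch
tstMSE     = tr.tperf(bestEpoch+1)   % test MSE at that epoch
stopReason = tr.stop                 % why training stopped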
> view(net)
> Y = net(Xs,Xi,Ai);
% Performance for the series-parallel implementation, only
% one-step-ahead prediction
> perf = perform(net,Ts,Y);
%% 5. Multi-step ahead prediction
> inputSeriesPred = [inputSeries(end-delay+1:end), inputSeriesVal];
> targetSeriesPred = [targetSeries(end-delay+1:end), con2seq(nan(1,N))];
> netc = closeloop(net);
> view(netc)
Check netc on previous data. If performance is bad, improve it by training netc on the previous data.
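A sketch of that check, reusing net, Ts, and Y from step 4 (the factor of 2 used as the retraining threshold is my own rule of thumb, not from the post):
% Sketch: compare closed-loop and open-loop performance on the design data,
% and retrain the closed-loop net only if it is clearly worse.
[Xc,Xci,Aci,Tc] = preparets(netc,inputSeries,{},targetSeries);
Yc    = netc(Xc,Xci,Aci);
perfc = perform(netc,Tc,Yc)         % closed-loop MSE on the design data
perfo = perform(net,Ts,Y)           % open-loop MSE from step 4
if perfc > 2*perfo                  % assumed threshold
    netc  = train(netc,Xc,Tc,Xci,Aci);
    Yc    = netc(Xc,Xci,Aci);
    perfc = perform(netc,Tc,Yc)
end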
> [Xs,Xi,Ai,Ts] = preparets(netc,inputSeriesPred,{},targetSeriesPred);
> yPred = netc(Xs,Xi,Ai);
> perf = perform(net,yPred,targetSeriesVal);
> figure;
> plot([cell2mat(targetSeries),nan(1,N);
>       nan(1,length(targetSeries)),cell2mat(yPred);
>       nan(1,length(targetSeries)),cell2mat(targetSeriesVal)]')
> legend('Original Targets','Network Predictions','Expected Outputs')
% The network predictions are coming out very bad. I guess there is some problem
% with the closed loop's initial input states and initial layer states.
% Please help.
1. Optimize the input delays (ID) and feedback delays (FD).
2. Use trn/val/tst division with 'divideblock' or 'divideind'.
3. Compare netc and net performance on the open-loop (design) data.
4. If necessary, train netc directly.
5. Then consider the nondesign (test) data.
Hope this helps
Thank you for formally accepting my answer
Greg
