49 views (last 30 days)

Dear Reader,

This is a continuation of the of the following newsgroup thread: http://se.mathworks.com/matlabcentral/newsreader/view_thread/345189?s_tid=srchtitle I think it is easier to use “answers” than “newsgroup” threads, mostly because of the possibility to label code and links…and also because of the possibility to attach figures (images).

INTRODUCTION

This is an analysis of how to perform multistep predictions on the solar datasets (doc nndatasets). Main improvement compared to previous iterations of this code in the added filtering of the signal. However, results indicate that the predictability can be improved. Unfortunately, it is not even possible to generate a prediction using a generated repetitive dataset which is here based on the solar dataset.

METHOD

I use a main program to have the possibility to test the 9 different training algorithms. Here is the code for main:

clear all

close all

% 1: g_train_method='trainlm';

% 2: g_train_method='trainbfg';

% 3: g_train_method='trainrp';

% 4: g_train_method='trainscg';

% 5: g_train_method='traincgb';

% 6: g_train_method='traincgf';

% 7: g_train_method='traincgp';

% 8: g_train_method='trainoss';

% 9: g_train_method='traingdx';

for i=1:9

current_training_method=i

[NMSEo_1(i),NMSEc(i),NMSEc2(i),toc(i)] = train_diff_method(i)

% test NMSEc - NMSEc2

end

figure;

plot(1:9,NMSEc,1:9,NMSEc2)

xlabel('Method number')

legend('NMSEc, without closed loop training','NMSEc2, with closed loop training')

figure;

plot(toc)

xlabel('Method number')

title('toc')

Then I use a function to evaluate the different training algorithms. One thing that should be noticed is that I used mapminmax to norm the signal. I read somewhere that trainbr requires this. Also, it cant do any harm if all methods use this normalized signal, at least as I see it.

function [NMSEo_1,NMSEc,NMSEc2,totaltime] = train_diff_method(train_method_no)

close all

% train_method_no=1;

if train_method_no==1

g_train_method='trainlm'

end

if train_method_no==2

g_train_method='trainbfg'

end

if train_method_no==3

g_train_method='trainrp'

end

if train_method_no==4

g_train_method='trainscg'

end

if train_method_no==5

g_train_method='traincgb'

end

if train_method_no==6

g_train_method='traincgf'

end

if train_method_no==7

g_train_method='traincgp'

end

if train_method_no==8

g_train_method='trainoss'

end

if train_method_no==9

g_train_method='traingdx'

end

format shortG

% GEH1 = [ 'START TIMING' ]

% STJ1 = [ ' DONE ' ]

tic

plt=0;

T = solar_dataset;

t = cell2mat(T);

% t=[t(477:2400) t(477:2400) t(477:2400)];

% FILTERING, PLOTTING AND MAKING T A CELL ARRAY, THIS TIME BASED ON FILTERED t

[b,a]=butter(2,0.1);

t_filt=filtfilt(b,a,t);

plt=plt+1;

figure(plt);

subplot(211);

plot(1:length(t),t,1:length(t),t_filt);

title('SOLAR DATASET, UNFILTERED AND FILTERED');

subplot(212);

plot(1:length(t),t-t_filt);

title('RESIDUALS; FILTERED SIGNAL SUBRTRACTED FROM RAW SIGNAL');

T=num2cell(t_filt);

T_map=mapminmax(cell2mat(T));

T=num2cell(T_map);

t = cell2mat(T)

[ O , N ] = size(t); % 'O' FOR '"o"utput

% GEH2 = [ ' REMOVE SEMICOLON [ O N] ==> [ 1 2899 ] ' ]

% STJ2 = [ ' DONE ' ]

% GEH3 = [ ' TO INSURE UNBIASED NONTRAINING PREDICTIONS'...

% ' USE DIVIDEBLOCK AND THEN USE Ntrn in NNCORR ' ]

% STJ3 = [ ' DONE ' ]

% GEH4 = [ ' Ntrn = N - 2*round(0.15*N) = 2029 ' ]

% STJ4 = [ ' OK ' ]

% ### One thing I think of initially is the use of training algorithm,

% should one always first use trainlm in all steps (OL and CL) or

% is there a scientific approach that can determine (before training is

% started) if e.g. trainbr would fit a specific signal better? ###

% GEH5 = [ ' I DON"T KNOW. BEST TO BEGIN WITH DEFAULTS' ...

% ' THEN SEE MY SPIELS ON OVERTRAINING, Hub AND MSEREG' ]

% STJ5 = [ ' ON THE SUBJECT OF Hub I FOUND SIGNIFICANT POSTS, I HAVEN'T FOCUSED ON OVERTRAINING YET BUT I WILL LOOG INTO THIS. MSEREG SEEMS TO BE OBSOLETE, ANY SUGGESTIONS WHAT TO USE INSTEAD? ']

MSE00 = var(t',1)

% GEH6 = [ ' TAKE MEAN IF if O > 1 ' ]

% STJ6 = [ ' GOOD POINT. IN THIS CASE O is equal to 1' ]

zt = zscore(t',1)';

plt=plt+1;

figure(plt);

subplot(211);

plot(t);

title('SOLAR DATASET');

subplot(212);

plot(zt);

title('SOLAR DATASET, STANDARDIZED')

% [ Xo, Xoi, Aoi, To ] = preparets( neto, {}, {}, T );

% to = cell2mat( To );

% zto = zscore(to,1);

% varto1 = mean(var(to',1));

% minmaxto = minmax([ to ; zto ])

% GEH7 = [ ' MINMAX zt CHECK FOR OUTLIERS? ' ]

% STJ = [ ' FIRST ITERATION I USE FILTFIT, HOPE TO USE WAVELET FILTERING IN THE NEAR FUTURE ']

% rng('default');

rng('shuffle');

L = floor(0.95*(2*N-1));

% GEH8 = [ 'JUST USE Ntrn, Ltrn FOR SIGLAGS ' ]

% ST8 = [ ' THE CODE BELOW, DOES IT LOOK OK? ']

for i = 1:100

n = randn(1,N);

autocorrn = nncorr( n,n, N-1, 'biased');

sortabsautocorrn = sort(abs(autocorrn));

thresh95(i) = sortabsautocorrn(L);

end

sigthresh95 = mean(thresh95); % 0.2194 % tested with * (times) 3 to decrease used fractions (outside dotted line)

% GEH9 = [ ' UH-OH ...THAT"S NOT WHAT I GET ! ' ]

% STJ9 = [ ' I UNDERSTAND AND AGREE, 0.2194 WAS FROM AN EARLIER SCRIPT ' ]

% GEH10 = [' ALSO GOOD TO KNOW OTHER STATS ']

% STJ10 = [ ' ALWAYS GOOD TO GET A FEEL FOR THE DATA, ANY OTHER PURPOSES? PERHAPS PARAMETERS TO STUDY WHEN THE SIZE OF i IS INCREASED/DECREASED? ' ]

minthresh95 = min(thresh95); % 0.024638

medthresh95 = median(thresh95); % 0.027242

meanthresh95 = mean(thresh95); %0.0273

stdthresh95 = std(thresh95); %0.0011573

maxthresh95 = max(thresh95); %0.030326

% [minthresh95 medthresh95 meanthresh95 stdthresh95 maxthresh95]

% GEH11 = [ 'MIGHT WANT TO KNOW WHAT IF max(i) > 100' ]

% STJ11 = [ ' I HAVE WORKED WITH DIFFERENT SIZES OF i, INCREASING i SHOULD GIVE BETTER STATISTICAL REPRESENTATION OF THE ABOVE PARAMETERS (minthresh95 etc.), HOWEVER, AN INCREASED i CAUSES ONLY MINOR CHANGES OF ABOVE PARAMETERS' ]

autocorrt = nncorr(zt,zt,N-1,'biased');

siglag95 = -1+ find(abs(autocorrt(N:2*N-1))>=sigthresh95);

% This three lines below decreases the length of siglab95 and FD

l_sig=length(siglag95);

r_l_sig=ceil(l_sig/15); % 15 used when three times the length is not used

siglag95=siglag95(1:r_l_sig);

FD=siglag95(2:end);

FD_length=length(FD);

d=max(FD);

% siglag95=0:1:156;

% siglag95 = -1+ find(abs(autocorrt(N:2*N-1))>=0.6);

% ### Memory is maxing out if I use all lag values from the above line

% probably this is one of the most important improvement areas ###

% GEH12 = [ '1. ONLY NEED A SUFFICIENT SUBSET ' ...

% ' 2. I THINK YOUR INITIAL CALCULATION IS FLAWED ' ]

% STJ12 = [ ' THE LENGTH OF siglag95 IS DECREASED, HOWEVER, THE MOST SIGNIFICANT WAVELENGTHS ARE STILL TAKEN INTO ACCOUNT, FOR MORE INFO, SEE: ...

% 'https://se.mathworks.com/matlabcentral/answers/284253-it-is-not-possible-to-predict-wavelengths-longer-than-the-maximum-index-value-of-the-fd-in-narxnet-a ' ]

plt = plt+1; figure(plt);

hold on

plot(0:N-1, -sigthresh95*ones(1,N),'b--');

plot(0:N-1, zeros(1,N),'k');

plot(0:N-1, sigthresh95*ones(1,N),'b--');

plot(0:N-1, autocorrt(N:2*N-1));

plot(siglag95,autocorrt(N+siglag95),'ro');

title('REDUCED NUMBER OF AUTOCORRELATIONS IN SOLAR DATASET');

% GEH13 = [ ' NOT SURE WHAT THAT MEANS ' ]

% STJ13 = [ 'FOR THIS VERSION OF THE CODE I HAVE REDUCED THE LENGTH OF ' ...

% ' SIGLAG95 TO ONLY INCLUDE ENOUGH DATAPOINTS SO THAT THE MOST SIGNIFICANT WAVELENGTH IS ACCONTED FOR ' ]

% defining the global parameters

% g_train_method='trainbr';

% http://se.mathworks.com/help/nnet/ug/choose-a-multilayer-neural-network-training-function.html

% Why isn't trainbr included in the above page?

% BR seems convincing, see http://www.ncbi.nlm.nih.gov/pubmed/19065804

g_no_epochs=5000;

g_trainRatio=74; % 66.67 74 89 50 61 21_22

g_valRatio=26; % 33.33 26 11 50 39 21_22

g_testRatio=0;

g_min_grad=2e-5;

g_trainmaxfail=100;

Ntrials=5;

% (n)-loop or the for(j)-loop? What difference does these three alternatives give? ###

% GEH14 = [ ' ALL YOU NEED TO KNOW IS WHAT THE RNG STATE IS' ...

% ' FOR A DESIGN YOU WANT TO EPRODUCE, SEE MY CODES' ]

% STJ14 = [ ' I THINK THAT WITH THE APPROACH (FOR AND WHILE LOOPS) I USE IN THIS CODE THE RNG STATE 'shuffle' GIVES THE CORRECT USAGE OF THE SCRIPT']

NFD = length(FD); % 147

LDB = max(FD); % 152

Ns = N-LDB; % 2747

Nseq = Ns*O; % 2747

% Nw = (NFD*O+1)*H+(H+1)*O % moved inside the first loop below

Hub = -1+ceil( (Nseq-O) / (NFD*O+O+1)); % 18

Hmax = ceil(Hub/10) % Hmax = 2 ==> Nseq >>Nw

min_nodes=1;

max_nodes=Hmax;

% GEH15 = [ ' JUSTIFICATION FOR THESE NUMBERS ? ... Hub = ?' ]

% STJ15 = [ ' Hub DEFINED ABOVE' ]

% STJX1 = [ ' WHAT WOULD JUSTIFY MORE THAN ONE HIDDEN LAYER? IT THERE ANY RESULT IN THIS SCRIPT THAT COULD SHOW THIS? E.G. NMSE, PREDICTIVE PEERFORMANCE '...

% [ ' IS THIS A VALID REFERENCE FOR THE QUESTION? ftp://ftp.sas.com/pub/neural/FAQ3.html#A_hl ' ]

% GEH16 = [ ' Ntrials = Number of trials; NOT trails ']

% STJ16 = [ ' CORRECTED ' ]

% GEH17 = [ ' Define narnet( FD, n) in outer loop. Ntrials of initial random weights in inner loop '...

% ' but NOT random datadivision for timeseries .. get nonuniform time spacing' ]

%

% STJ17 = [ ' NARNET IS NOW DEFINED IN OUTER LOOP. WITH THIS COMMENT "I did not try randomizing these states here yet, possible future improvement..." I MEANT THAT I WOULD TRY DIFFERENT LEVELS OF TRN/VAL/TST. ']

% 'DOES NOT net.divideFcn='divideblock GUARANTEE THAT THE TIME

% SPACING IS UNIFORM?'...

% 'DEFINING NARNET IN THE OUT LOOP (AS IT LOOKS LIKE NOW DOES...

% 'NOT GIVE RANDOM WEIGHTS THUS A REPETITION OF Ntrials...

% ' WOULD NOT GIVE ANY ADDITIONAL INFORMATION ']

% GEH18 = ['MORE LATER']

% STJ18 = [ ' OK ']

best_Y=[];

NMSEin=Inf;

i=0;

for n=min_nodes:max_nodes

Nw = (NFD*O+1)*n+(n+1)*O;

% Nw when n = 1 : 154

% Nw when n = 2 : 307

% (HOWEVER, SIZE OF Nw SEEMS TO BE METHOD DEPENDANT)

i=i+1; % Counter for indices inside the second loop (do not tie up anything to n).

for j=1:Ntrials

%%%%start copy

net = narnet(FD, n);

% GEH19 = [ ' USE THE SHORTEST SUBSET AS POSSIBLE ' ]

% STJ19 = [ ' SEE COMMENTS ABOVE ON USING THE SIGNIFICANT WAVELENGTH ' ]

% Setup Division of Data for Training, Validation, Testing

net.trainFcn = g_train_method;

net.divideFcn='divideblock'; % I did not try randomizing these states here yet, possible future improvement...

net.trainParam.epochs = g_no_epochs;

net.divideParam.trainRatio = g_trainRatio/100;

net.divideParam.valRatio = g_valRatio/100;

net.divideParam.testRatio = g_testRatio/100;

net.trainParam.min_grad=g_min_grad;

net.trainParam.max_fail=g_trainmaxfail;

[Xo,Xoi,Aoi,To] = preparets(net,{},{},T);

%%%end copy

% GEH20 = [ 'FOR TIMESERIES PREDICTION TEST SUBSET'...

% ' MUST OCCUR AFTER THE TRAINING AND VAL SUBSETS']

% STJ20 = [ ' IS THIS BECAUSE THE TEST DATA SHOULD HAVE AS SIMILAR CHARATERISTICS AS POSSIBLE TO THE PREDICTED DATA? OR ARE THERE OTHER REASONS? ']

% GEH21 = [ 'WHY 10 ?' ]

% STJ21 = [ 'TO INVESTIGATE IF ERROR WOULD DECREASE. I'VE ALSO TRIED DIFFERENT LEVELS HERE, UP TO 100' ]

[ net ,tr, Yo, Eo, Xof, Aof] = train(net,Xo,To,Xoi,Aoi);

% view(net);

% Yo = net(Xo,Xoi,Aoi);

% Eo = gsubtract(To,Yo);

% GEH22 = [ 'COMMENT PREVIOUS TWO STATEMENTS' ]

% STJ22 = [ 'DONE' ]

NMSEo_temp(i,j) = mse(Eo)/MSE00;

epochs = tr.epoch(end);

% GEH23 = [ ' BETTER TO JUST USE A REASONABLE NUMBER OFDELAYS ' ]

% STJ23 = [ ' SEE COMMENT STJ13 ' ]

[n j NMSEo_temp(i,j) mse(Eo) Nw epochs] % Just to show what step is currently being calculated and to show the important parameters

end

NMSEo=min(NMSEo_temp(i,:));

% GEH24 = [' NO! FIND THE BEST DESIGN, NOT THE BEST AVERAGE!']

% STJ24 = [ ' DONE, SEE ADDED WHILE LOOP BELOW ' ]

% GEH25 = [ ' HOWEVER DO TABULATE SUMMARY STATS OF' ...

% ' MIN, MEDIAN, MEAN, STD AND MAX']

% STJ25 = [ ' FOR THIS VERSION OF THE CODE I HAVE FOCUSED ON THE MINIMUM ERROR ']

if (NMSEo<NMSEin)

fprintf('Best minimmum NMSE so far: %4.6e (for %d Hidden nodes)\n',NMSEo,n);

NMSEin=NMSEo;

best_Y=Yo;

N_use=n;

end

end

% searching for the smallest NMSEo from initial Ntrials

mean_NMSEo=mean(NMSEo_temp');

std_NMSEo=std(NMSEo_temp');

min_NMSEo=min(NMSEo_temp');

RESULTS

I have used the above code and investigated the following trn/val ratios (first number is trn, second is val): 50-50,61-39,74-26,89-11,93.5-6.5. The training method numbeer refers to the following training method (you will also be able to find this in the code for "main"):

% 1: g_train_method='trainlm';

% 2: g_train_method='trainbfg';

% 3: g_train_method='trainrp';

% 4: g_train_method='trainscg';

% 5: g_train_method='traincgb';

% 6: g_train_method='traincgf';

% 7: g_train_method='traincgp';

% 8: g_train_method='trainoss';

% 9: g_train_method='traingdx';

Here is the "winning" NMSEo from the while loop. This NMSEo is selected on the basis that it is smaller (or equal) than the 10 NMSEo (Ntrials) generated in the above for loop.

Then I close the loop and obtain NMSEc, error from closing the loop with no training of the closed loop performed.

Here I have limited the y axis to 0 - 2. As you can see, no values are below 1.

Then I close the loop and also perform training of the closed loop. With this I obtain the following NMSEc2.

Here I have limited the y axis to 0 - 2. NMSEc2 is decreased compared to NMSEc, however, regardless of method I am still far away from any desired value.

Please let me know if you would like me to submit tables of the data instead of figures..(however, I hope the message that I have not obtained a desired low value of NMSEc or NMSEc2 is still clear). In this result section I could show numerous figures that portray the high values of NMSEc or NMSEc2 but I see little meaning is uploading figures with the following appearance:

METHOD - PART 2

So, back to the drawing board. I thought to myself, perhaps the solar dataset cannot be used for prediction. There might be some truth to this, I don't see any repeating pattern in the longer wavelengths (longer wavelengths are round 130 points, for more info see this post: https://se.mathworks.com/matlabcentral/answers/284253-it-is-not-possible-to-predict-wavelengths-longer-than-the-maximum-index-value-of-the-fd-in-narxnet-a). I decided to create a repeating pattern. I used three repetitions of a part of the solar dataset, namely (you can find this text in the function code, it is commented):

t=[t(477:2400) t(477:2400) t(477:2400)];

Start and end of sampling; 477 and 2400 are selected on the basis that the signal should be as continuous as possible. I used filtering and analysis of the diff to determine these values. The content of FD is similar for "part 1" and this "part 2". To not increase computational time too much I decreased Ntrials from 10 to 5. Here is a figure showing the "3*repeated solar dataset", I've added some black rectangles to make it easier to see reoccurring features.

For this analysis I have used a trn/val ratio of 66,67 - 33.33, I thought the model could generate a small NMSEo is the val set were identical to two reoccuring parts of the trn dataset.

RESULTS - PART 2

Here is the NMSEo for the 9 different methods using the "3*repeated solar dataset":

4.1823e-08 1.785e-06 0.00042742 1.031e-05 0.00013799 0.00011407 0.00013801 2.9156e-05 0.00096596

Here is the NMSEc (no training) for the 9 different methods using the "3*repeated solar dataset":

2.2568 243.24 6.013 28.174 10.056 9.0195 10.83 10.871 5.2844

Here is the NMSEc2 (training is performed) for the 9 different methods using the "3*repeated solar dataset":

2.2568 13.607 1.0029 27.614 10.056 8.9929 10.83 1.0002 1.0013

DISCUSSION

I could think of a few things to further develop the approach but, at this stage I would be very thankful for receiving a second opinion. I would be very happy if I've made one or a few simple mistakes that could give me a better NMSEc, I really thought that using a repeated signal as I did in part 2 would give me an improved prediction with a small residual error. If there are no simple fixes here, my next step is to investigate using all of the significant FD obtained from nncorr.

Greg Heath
on 30 May 2016

Here is my preliminary result with default 70/15/15 datadivision and H = 10 hidden nodes. However, NONDEFAULT DIVIDEBLOCK was used to impose TRUE FUTURE PREDICTION WITH CONSTANT TIMESTEPS!

close all, clear all, clc, plt=0, tic

T = solar_dataset ;

t = cell2mat(T) ;

[ O N ] = size(t) % [ 1 2899 ]

zt = zscore(t,1); tzt = [ t ; zt ];

meantzt = mean( tzt' )' %[ 51.727 ; -9.38e-16 ]

vartzt1 = var( tzt', 1)' % [1937.3 ; 1 ]

minmaxtzt = minmax(tzt)

% minmaxtzt = 0 253.8

% -1.1752 4.5911

% Possible outliers

Nout3 = numel(find(abs(zt) > 3)) % 27 ( < 0.93%)

Nout4 = numel(find(abs(zt) > 4)) % 4 ( < 0.14%)

% Keep outliers for now

Ntst = round(0.15*N), Nval = Ntst % 435, 435

Ntrn = N-Nval-Ntst % 2029

trnind = 1:Ntrn; % 1 : 2029

valind = Ntrn+1:Ntrn+Nval; % 2030 : 2464

tstind = Ntrn+Nval+1:N; % 2465 : 2899

ttrn = t(trnind); tval = t(valind); ttst = t(tstind);

plt = plt+1, figure(plt), hold on

plot( trnind, ttrn,'b' )

plot( valind, tval,'g' )

plot( tstind, ttst, 'r' )

legend('TRAIN','VAL','TEST')

title( ' SOLAR DATASET TIMESERIES ' )

rng('default');

Ltrn = floor(0.95*(2*Ntrn-1)) % 3854

imax = 200

for i = 1:imax

n = randn(1,Ntrn);

autocorrn = nncorr( n,n, Ntrn-1, 'biased');

sortabsautocorrn = sort(abs(autocorrn));

thresh95(i) = sortabsautocorrn(Ltrn);

end

For imax = 100 & 200:

minthresh95 = min(thresh95) % 0.0293 0.0293

medthresh95 = median(thresh95)% 0.0325 0.0325

meanthresh95 = mean(thresh95) % 0.0325 0.0325

stdthresh95 = std(thresh95) % 0.0015 0.0014

maxthresh95 = max(thresh95) % 0.0355 0.0355

sigthresh95 = meanthresh95 % 0.0325

autocorrt = nncorr( zt, zt, Ntrn-1, 'biased' );

siglag95 = -1+ find(abs(autocorrt(Ntrn:2*Ntrn-1)) ...

>= sigthresh95);

Nsiglag95 = length(siglag95) % 1591 ~ 0.78*Ntrn

d = min(find(diff(siglag95)>1)) % 35

proof = siglag95(d-1:d+1) % 33 34 38

plt = plt+1; figure(plt);

hold on

plot(0:Ntrn-1, -sigthresh95*ones(1,Ntrn),'b--' );

plot(0:Ntrn-1, zeros(1,Ntrn),'k');

plot(0:Ntrn-1, sigthresh95*ones(1,Ntrn),'b--' );

plot(0:Ntrn-1, autocorrt(Ntrn:2*Ntrn-1));

plot( siglag95, autocorrt(Ntrn + siglag95), 'ro' );

title('SIGNIFICANT AUTOCORRELATIONS VALUES')

to = t(d+1:N);

No = N-d %2864

Notst = round(0.15*No) %430

Noval = Notst %430

Notrn = No-Noval-Notst %2004

H = 10 % Default

neto = narnet(1:d,H);

[ Xo, Xoi, Aoi, To ] = preparets(neto,{},{},T);

to = cell2mat(To);

[ O No ] = size(to) % [ 1 2864 ]

varto1 = var(to,1) % 1948.6

[neto tro Yo Eo Xof Aof ] = train(neto,Xo,To,Xoi,Aoi);

view(neto)

% Yo = net(Xo,Xoi,Aoi); Eo = gsubtract(To,To);

NMSEo = mse(Eo)/varto1 % 0.11505

NMSEtrno = tro.best_perf/varto1 % 0.10849

NMSEvalo = tro.best_vperf/varto1 % 0.12139

NMSEtsto = tro.best_tperf/varto1 % 0.13927

Rsqo = 1-NMSEo % 0.88495

Rsqtrno = 1-NMSEtrno % 0.89151

Rsqvalo = 1-NMSEvalo % 0.87861

Rsqtsto = 1-NMSEtsto % 0.86073

plotresponse(To,Yo)

totaltime = toc % 24.0

% The next step is to use a double loop over hidden

% nodes

%

% Hmax <= Hub = (Ntrn*O-O)/(d+O+1) % 54

%

% and initial RNG states (varies initial weights)

Hope this helps.

Greg

Greg Heath
on 31 May 2016

I have found 4 errors in the previous post:

1. d = 34

2. DIVIDEBLOCK WAS NOT IMPLEMETED.

3.Significant lags were obtain via zt instead of ztrn.

4. The RNG was not initialized before training

I will replace it ASAP.

Greg

Greg Heath
on 1 Jun 2016

Edited: Greg Heath
on 1 Jun 2016

close all, clear all, clc, plt=0, tic

% FOR UNBIASED PREDICTION OF FUTURE SOLAR FLARE EVENTS, 70/15/15 DIVIDEBLOCK DATADIVISION IS USED TO INSURE

1. UNIFORM TIME SAMPLING

2. ONLY TRAINING DATA (THE FIRST 70%) IS EXPLICITLY

USED TO CALCULATE NETWORK PARAMETERS

HOWEVER, PERFORMANCE ON VALIDATION DATA (THE FOLLOWING

15%) IS MONITORED TO STOP TRAINING WHEN GENERALIZATION TO

GOOD PERFORMANCE ON NONTRAINING DATA BEGINS TO FAIL

T = solar_dataset ; t = cell2mat(T) ;

[ O N ] = size(t) % [ 1 2899 ]

zt = zscore(t,1); tzt = [ t ; zt ];

meantzt = mean( tzt' )' %[ 51.727 ; -9.38e-16 ]

vartzt1 = var( tzt', 1)' % [1937.3 ; 1 ]

minmaxtzt = minmax(tzt)

% minmaxtzt = 0 253.8

% -1.1752 4.5911

% Possible outliers

Nout3 = numel(find(abs(zt) > 3)) % 27 ( < 0.93%)

Nout4 = numel(find(abs(zt) > 4)) % 4 ( < 0.14%)

% NOTE: WILL CURRENTLY IGNORE OUTLIER MITIGATION

Ntst = round(0.15*N), Nval = Ntst % 435, 435

Ntrn = N-Nval-Ntst % 2029

trnind = 1:Ntrn; % 1 : 2029

valind = Ntrn+1:Ntrn+Nval; % 2030 : 2464

tstind = Ntrn+Nval+1:N; % 2465 : 2899

ttrn = t(trnind); tval = t(valind);ttst = t(tstind);

plt = plt+1, figure(plt), hold on

plot( trnind, ttrn,'b' )

plot( valind, tval,'g' )

plot( tstind, ttst, 'r' )

legend('TRAIN','VAL','TEST')

title( ' SOLAR DATASET TIMESERIES ' )

rng('default');

Ltrn = floor(0.95*(2*Ntrn-1)) % 3854

imax = 200

for i = 1:imax

n = randn(1,Ntrn);

autocorrn = nncorr( n,n, Ntrn-1, 'biased');

sortabsautocorrn = sort(abs(autocorrn));

thresh95(i) = sortabsautocorrn(Ltrn);

end

% imax = 100 200

minthresh95 = min(thresh95) % 0.0293 0.0293

medthresh95 = median(thresh95)% 0.0325 0.0325

meanthresh95 = mean(thresh95) % 0.0325 0.0325

stdthresh95 = std(thresh95) % 0.0015 0.0014

maxthresh95 = max(thresh95) % 0.0355 0.0355

sigthresh95 = meanthresh95 % 0.0325

zttrn = zscore(ttrn',1)';

autocorrttrn = nncorr( zttrn, zttrn, Ntrn-1, 'biased' );

siglag95 = find(abs(autocorrttrn(Ntrn+1:2*Ntrn-1)) ...

>= sigthresh95);

Nsiglag95 = length(siglag95) % 1327 ~ 0.65*Ntrn

d = min(find(diff(siglag95)>1))% 35

proof = siglag95(d-1:d+1) % 34 35 39

plt = plt+1; figure(plt); hold on

plot(0:Ntrn-1, -sigthresh95*ones(1,Ntrn),'b--' );

plot(0:Ntrn-1, zeros(1,Ntrn),'k');

plot(0:Ntrn-1, sigthresh95*ones(1,Ntrn),'b--' );

plot(0:Ntrn-1, autocorrttrn(Ntrn:2*Ntrn-1));

plot( siglag95, autocorrttrn(Ntrn + siglag95), 'ro' );

title('SIGNIFICANT AUTOCORRELATIONS VALUES')

to = t(d+1:N);

No = N-d % 2864

Notst = round(0.15*No) % 430

Noval = Notst % 430

Notrn = No-Noval-Notst % 2004

H = 10 % Default

neto = narnet(1:d,H);

neto.divideFcn = 'divideblock';

[ Xo, Xoi, Aoi, To ] = preparets(neto,{},{},T);

to = cell2mat(To); varto1 = var(to,1) % 1948.6

% Desire number of unknown weights Nw = (d+1)*H+(H+1)*O to not exceed number of training equations Notrneq = Notrn*O ==> H < = Hub where

Hub = (Notrn*O-O)/(d+O+1) % 54.14

Hmin = 10, dH = 10, Hmax = 50

Ntrials = 10

rng('default')

j = 0

for h = Hmin:dH:Hmax

j = j+1

neto.layers{1}.dimensions = h;

for i = 1:Ntrials

neto = configure(neto,Xo,To);

[neto tro Yo Eo Xof Aof ]=train(neto,Xo,To,Xoi,Aoi);

%[Yo Xof Aof =net(Xo,Xoi,Aoi);Eo=gsubtract(To,Yo);

NMSEo(i,j) = mse(Eo)/varto1;

NMSEtrno(i,j) = tro.best_perf/varto1;

NMSEvalo(i,j) = tro.best_vperf/varto1;

NMSEtsto(i,j) = tro.best_tperf/varto1;

end

end

plotresponse(To,Yo)

NumH = Hmin:dH:Hmax

Rsqo = 1-NMSEo

Rsqtrno = 1-NMSEtrno

Rsqvalo = 1-NMSEvalo

Rsqtsto = 1-NMSEtsto

totaltime = toc

% Rsqo

% H=10 20 30 40 50

% 0.877 0.881* 0.878 0.790 0.775

% 0.880* 0.883* 0.776 0.744 0.794

% 0.858 0.847 0.768 0.650 0.834

% 0.436 0.886* 0.681 0.859 0.762

% 0.879 0.809 0.516 0.803 0.819

% 0.878 0.714 0.883* 0.646 0.880*

% 0.881* 0.881* 0.860 0.841 0.781

% 0.879 0.864 0.810 0.881* 0.778

% 0.887* 0.879 0.641 0.882* 0.549

% 0.875 0.869 0.883* 0.657 0.714

%

% Rsqtrno

%H=10 20 30 40 50

% 0.900 0.907 0.915 0.877 0.870

% 0.906 0.906 0.905 0.906 0.948

% 0.901 0.939 0.902 0.892 0.865

% 0.902 0.915 0.933 0.929 0.956

% 0.907 0.871 0.904 0.884 0.902

% 0.905 0.895 0.903 0.968 0.903

% 0.899 0.897 0.929 0.950 0.911

% 0.913 0.928 0.839 0.897 0.953

% 0.916 0.896 0.738 0.900 0.883

% 0.902 0.911 0.929 0.943 0.851

%

% Rsqvalo

%H=10 20 30 40 50

% 0.865 0.860 0.833 0.788 0.744

% 0.863 0.860 0.769 0.689 0.627

% 0.835 0.780 0.769 0.742 0.831

% 0.756 0.860 0.780 0.763 0.631

% 0.866 0.811 0.584 0.729 0.748

% 0.864 0.578 0.863 0.553 0.862

% 0.872* 0.875* 0.839 0.771 0.730

% 0.854 0.777 0.797 0.874* 0.770

% 0.855 0.874* 0.553 0.875* 0.687

% 0.865 0.856 0.821 0.730 0.678

%

% Rsqtsto

%H=10 20 30 40 50

% 0.785 0.777 0.752 0.388 0.359

% 0.777 0.801* 0.175 0.039 0.243

% 0.679 0.484 0.140 -0.570 0.688

% -2.06 0.778 -0.597 0.627 -0.014

% 0.764 0.510 -1.363 0.497 0.503

% 0.768 0.006 0.806* -0.762 0.786

% 0.807* 0.814* 0.559 0.406 0.223

% 0.745 0.650 0.685 0.814* -0.031

% 0.783 0.805* 0.274 0.805* -1.150

% 0.758 0.689 0.730 -0.752 0.113

%

%These test Rsquare values are not high enough to obtain

% a useful closed loop configuration by just closing the loop.

% The next step is to continue training the net after closing

% the loop and, if unsuccessful, to consider training the CL

% configuration with a new set of initial random weights.

Hope this helps.

Thank you for formally accepting my answer

Greg

Greg Heath
on 27 May 2016

Edited: Greg Heath
on 28 May 2016

For UNBIASED FUTURE PREDICTIONS of the SOLAR_DATASET, I see the

initial attack structured as follows:

1. DIVIDEBLOCK DATADIVISION

2. DEFAULT 0.7/0.15/0.15 TRN/VAL/TST datadivision ratios

3. TRN subset is known. VAL and TST subsets are unknown.

4. STATIONARY SUMMARY STATISTICS so that the

min/median/mean/max/stdv/siglags

of VAL & TST subsets are ASSUMED to be the same as those of the

TRN subset.

5. Emphasis on estimating a good combination of feedback delays FD =

1:d and number of hidden nodes, H, to obtain Rsqc >= 0.99

(i.e., NMSEc <= 0.01 ).

6. TRAINBR is only considered when the number of training equations,

Ntrneq ~ 0.7*Ntrn*O, is not sufficiently larger than the number of

unknown weights Nw = (d+1)*H+(H+1)*O.

7. Although performance function MSEREG is obsolete and has been

replaced by the REGULARIZATION OPTION in MSE, it is STILL

AVAILABLE! Unfortunately, there is a BUG in TRAINBR that does not

allow the use of a VAL subset. I say unfortunate because recent

investigations have shown that the combination can achieve

superior results!

8. My initial investigations have shown that

a. The summary stats are not stationary.

b. There are outliers which may be significantly detrimental.

9. The next steps are

a. Estimate the significant lags of the TRN subset

b. Try to find combinations of d and H to minimize Nw subject to

the OL (i.e., OPEN LOOP) performance constraint Rsqo >= 0.999

( i.e., NMSEo <= 0.001 ).

c. Determine what has to be done to the OL & CL (CLOSELOOP)

configurations to optimize the CL performance. If necessary,

consider outlier modifications.

Hope this helps.

Thank you for formally accepting my answer

Greg

Staffan
on 11 Aug 2016

Edited: Staffan
on 12 Aug 2016

Greg Heath
on 23 Aug 2016

% Dividetrain --> I’ve put some hours on this; This method does not increase the accuracy of the prediction. I don’t think it is impossible to see improvement using dividetrain, however, I see three detriments: (1) increased likelihood of over fitting, (2) increased computational time and (3) only marginal room for improvement compared to divideint.

(1) I have been operating under the edict: For unbiased timeseries prediction

max(indtrn,indval) < min(indtst)

So far, no success. Frustrated, I have cheated and looked at DIVIDETRAIN mainly to find out why everthing is failing. Surprisingly, even this has failed.

Now I'm pretty sure that the basic problems are

a. Nonstationarity

b. Insufficient use of significant delays.

c. I'm on vacation and my wife wants to denerd me for a week.

(2) Overfitting is never a problem for me because I always

a. Compare the number of training equations Ntrneq with the number of

unknown weights Nw (Search NEWSREADER and ANSWERS using Ntrneq Nw).

b. Try to minimize the number of hidden nodes subject to a maximum allowed

error

(3). So far training time has been no problem at all.

(4). With default delays , nothing works for me. I think skipping the estimation of significant delays is my biggest mistake.

% Divideint 34/33/33 --> At this early stage I still like produce a prediction that I can visualise. I think that parameters such as NMSE or R2 for tst values are applicable but I think such parameters could be more usable in an optimization (at a later stage, so to say). If I can visualise a prediction in a graph I analyse both the accuracy of wavelengths and amplitudes in one go.

% DIVIDE into stationary subsections --> You are here thinking about e.g. % using divideblock 44/26/30 and to be sure that the data is (more) stationary at each of these divisions? Regarding stationary data, do you have a suggestion that you think I should read on this subject? I get the concept of that it’s not that easy to perform predictions on data that contains larger trends and I’ve brushed up on Dickey–Fuller test (from wiki, mostly) but I think there is a better reference more applicable to neural networks than the ones I’ve come across.

My experience with trend removal has been limited to polynomials and the equivalent of replacing variables with differences.

% Difference the series to reduce non-stationarity --> Are you here suggesting that I should perform the NN calculation of the diff of the dataset?...simply : % % T = solar_dataset; % t = cell2mat(T); % t_diff=diff(t); % T_diff=num2cell(t); % … and so forth using t_diff and T_diff instead of t and T

Yes. This is standard procedure. In fact even higher order diferences are used (differences remove linear trends/2nd differences remove quadratic trends, etc).

Frustrating problem.

More later

Greg

Shashank Bhatia
on 12 Aug 2018

I think this post has most from Greg and Staffan. Thank you both for this deeper discussion into the matter on which Mathworks have failed to give appropriate examples.

Staffan it would be best if you can post your final code for this example.

Greg, I request you to please add if you have any further developments on this example.

I am trying to compile all the relevant information regarding NARNET from various threads by Greg to make 1 post which covers the most.

Thanks once again.

Sincerely, Shashank

Saira AlZadjali
on 14 Oct 2018

Dear Greg,

i am working on almost same, predicting wind speed using narnet it have been long time, did lots of changes in network parameters, but all giving almost same accuracy or worse. so read more in deep, in ur answers here. and got to know that we need to do autocorrelation to get the FD. thus i followed the method above for my dataset. but i am getting d value very large :( can you help to find suitable delay and hidden layer neurons, i am getting open loop MSE = range (0.34xx to 0.38xx) for all tries which i did with different Delay and Hidden neurons, with different training function, and algorithms) finally i came to do this autocorrelation method, but here again stuck, d is "8445" which is very huge number.. please advise //

data_1= xlsread('G:\SQU2\Solar_Wind_Project\MAtlabCodes\data');

T = data_1 (1:79146,2);

T=T';

t = T ;

[ O N ] = size(t) % [ 1 79146 ]

zt = zscore(t,1); tzt = [ t ; zt ];

meantzt = mean( tzt' )' %[ 6.2702 ; 0.0000 ]

vartzt1 = var( tzt', 1)' % [10.4444 ; 1.0000 ]

minmaxtzt = minmax(tzt)

% minmaxtzt = 0.2100 18.1810

% -1.8752 3.6855

% Possible outliers

Nout3 = numel(find(abs(zt) > 2)) % 3305

Nout4 = numel(find(abs(zt) > 3)) % 221

% NOTE: WILL CURRENTLY IGNORE OUTLIER MITIGATION

Ntst = round(0.15*N), Nval = Ntst % 11872, 11872

Ntrn = N-Nval-Ntst % 55402

trnind = 1:Ntrn ; % 1 : 55402

valind = Ntrn+1:Ntrn+Nval; % 55403 : 67274

tstind = Ntrn+Nval+1:N; % 67275 : 79146

ttrn = t(trnind); tval = t(valind);ttst = t(tstind);

plt = plt+1, figure(plt), hold on

plot( trnind, ttrn,'b' )

plot( valind, tval,'g' )

plot( tstind, ttst, 'r' )

legend('TRAIN','VAL','TEST')

title( ' WIND SPEED DATASET TIMESERIES ' )

rng('default');

Ltrn = floor(0.95*(2*Ntrn-1)) % 105262

imax = 100

for i = 1:imax

n = randn(1,Ntrn);

autocorrn = nncorr( n,n, Ntrn-1, 'biased');

sortabsautocorrn = sort(abs(autocorrn));

thresh95(i) = sortabsautocorrn(Ltrn);

end

% imax = 100

minthresh95 = min(thresh95) % 0.0061

medthresh95 = median(thresh95)% 0.0062

meanthresh95 = mean(thresh95) % 0.0062

stdthresh95 = std(thresh95) % 5.1053e-05

maxthresh95 = max(thresh95) % 0.0064

sigthresh95 = meanthresh95 % 0.0062

zttrn = zscore(ttrn',1)';

autocorrttrn = nncorr( zttrn, zttrn, Ntrn-1, 'biased' );

siglag95 = find(abs(autocorrttrn(Ntrn+1:2*Ntrn-1)) ...

>= sigthresh95);

Nsiglag95 = length(siglag95) % 52291 ~ 0.9438*Ntrn

d = min(find(diff(siglag95)>1))% 8445

proof = siglag95(d-1:d+1) % 8444 8445 8470

plt = plt+1; figure(plt); hold on

plot(0:Ntrn-1, -sigthresh95*ones(1,Ntrn),'b--' );

plot(0:Ntrn-1, zeros(1,Ntrn),'k');

plot(0:Ntrn-1, sigthresh95*ones(1,Ntrn),'b--' );

plot(0:Ntrn-1, autocorrttrn(Ntrn:2*Ntrn-1));

plot( siglag95, autocorrttrn(Ntrn + siglag95), 'ro' );

title('SIGNIFICANT AUTOCORRELATIONS VALUES')

to = t(d+1:N);

No = N-d % 70701

Notst = round(0.15*No) % 10605

Noval = Notst % 10605

Notrn = No-Noval-Notst % 49491

Opportunities for recent engineering grads.

Apply Today
## 3 Comments

## Direct link to this comment

https://ww2.mathworks.cn/matlabcentral/answers/285851-suggestions-for-improvement-s-on-narnet-multistep-ahead-predictions-on-the-solar-dataset#comment_369145

⋮## Direct link to this comment

https://ww2.mathworks.cn/matlabcentral/answers/285851-suggestions-for-improvement-s-on-narnet-multistep-ahead-predictions-on-the-solar-dataset#comment_369145

## Direct link to this comment

https://ww2.mathworks.cn/matlabcentral/answers/285851-suggestions-for-improvement-s-on-narnet-multistep-ahead-predictions-on-the-solar-dataset#comment_369177

⋮## Direct link to this comment

https://ww2.mathworks.cn/matlabcentral/answers/285851-suggestions-for-improvement-s-on-narnet-multistep-ahead-predictions-on-the-solar-dataset#comment_369177

## Direct link to this comment

https://ww2.mathworks.cn/matlabcentral/answers/285851-suggestions-for-improvement-s-on-narnet-multistep-ahead-predictions-on-the-solar-dataset#comment_369298

⋮## Direct link to this comment

https://ww2.mathworks.cn/matlabcentral/answers/285851-suggestions-for-improvement-s-on-narnet-multistep-ahead-predictions-on-the-solar-dataset#comment_369298

Sign in to comment.