Hi,
(Q1) In this context, are there any differences between 'predict' and 'predictAndUpdateState' in the prediction step of an LSTM, other than that 'predict' returns a whole sequence of predictions while 'predictAndUpdateState' makes predictions one step at a time? I ask because 'predict' also updates the network state between each prediction.
(Q2) I am training the LSTM model on the first 900 seconds (training set) and forecasting the response for the next 100 seconds (test set). If I use YPred = predict(net,XTest), this updates 'net' according to XTest, which is the test input. Under the forecasting problem formulation, however, I do not have the test set at prediction time, so I should not be using it. Instead, I should update the model and base my predictions on XTrain and YTrain only, as in the code below. But the predictions from this approach are not good at all. Is it acceptable to use XTest for prediction? If not, how can I improve the predictions of the following code?
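for n = 1:numObs
    % Warm up the network state on the n-th training sequence
    [net, Y] = predictAndUpdateState(net, XTrain{n});
    Y = Y(:, end);   % keep only the prediction at the last time step
    Yseq = [];
    % Closed-loop forecast: feed each prediction back in as the next input
    for t = 1:numSteps
        [net, Y] = predictAndUpdateState(net, Y);
        Yseq = cat(2, Yseq, Y);
    end
    YTest{n} = Yseq;
    net = resetState(net);   % clear the state before the next sequence
end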
I am using the following network size and training options:
numHiddenUnits = 100;
options = trainingOptions('adam', ...
    'MaxEpochs',200, ...
    'GradientThreshold',1, ...
    'InitialLearnRate',0.005, ...
    'LearnRateSchedule','piecewise', ...
    'LearnRateDropPeriod',125, ...
    'LearnRateDropFactor',0.2, ...
    'MiniBatchSize',100, ...
    'Verbose',1, ...
    'Plots','training-progress');