Low frequency response from LSTM model
How can I get a low-frequency output from an LSTM network? Below is the time-history response of my input features, which have a relatively low-frequency content.
My LSTM network architecture is as follows:
layers = [
    sequenceInputLayer(size(X_train{1}, 1))                    % input features (F)
    lstmLayer(x.num_hidden_units_1, 'OutputMode', 'sequence')
    tanhLayer
    dropoutLayer(0.05)
    lstmLayer(x.num_hidden_units_2, 'OutputMode', 'sequence')
    dropoutLayer(0.05)
    tanhLayer
    fullyConnectedLayer(x.num_layers_ffnn)
    tanhLayer
    fullyConnectedLayer(1)                                     % scalar regression output
];
During training, my network predictions are plotted against the target output as follows.
The network output has a very high-frequency component on the validation data; however, when the model is used to predict the test data, it gives a flat line.
The two major concerns for me are:
1) Why does the LSTM network give a high-frequency output even when the input features have a relatively low frequency?
2) When the model produces a high-frequency output during training, why does it give a flat line during testing?
Answer (1)
Debraj Maji
2023-11-17
I see that you are trying to understand why your LSTM network gives a high-frequency output on the training data even though the input features have a low frequency.
The model might have overfitted to the training data and captured noise or specific patterns that do not generalize. This can result in high-frequency outputs on the training set, but when the model is applied to unseen data, it fails to generalize, leading to a flat line.
LSTMs are designed to overcome the limitations of traditional RNN-based architectures because they can capture long-term dependencies in sequential data. They are not inherently unsuitable for low-frequency data; in your case, however, the network is unable to capture the underlying pattern in the sequence due to the nature of the input features. The high-frequency pattern in the output is mainly noise introduced by inaccuracies in the prediction.
The possible ways to mitigate this issue are:
- Increase the amount of training data.
- Apply feature engineering and feature scaling (e.g., normalize the inputs).
- Experiment with different weight initializations, learning rates, or optimization algorithms to stabilize training. Monitoring the training and validation loss curves can provide insight into model stability.
- Systematically tune hyperparameters using techniques such as grid search or random search to find the most suitable values for your specific problem.
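As an illustration of the feature-scaling point, here is a minimal sketch that z-score normalizes each feature channel using statistics computed from the training set only. The variable names `X_train` and `X_test` follow the question's code; `mu` and `sigma` are introduced here for illustration.

```matlab
% Z-score normalize each feature channel. Statistics come from the
% training set only, so no test-set information leaks into training.
allTrain = cat(2, X_train{:});          % F-by-totalTimeSteps matrix
mu    = mean(allTrain, 2);              % per-feature mean
sigma = std(allTrain, 0, 2);            % per-feature standard deviation

normalizeSeq = @(s) (s - mu) ./ sigma;  % broadcasts over time steps
X_train = cellfun(normalizeSeq, X_train, 'UniformOutput', false);
X_test  = cellfun(normalizeSeq, X_test,  'UniformOutput', false);
```

Alternatively, sequenceInputLayer supports a 'Normalization','zscore' option that applies the same kind of scaling inside the network.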
For more information on fine-tuning an LSTM, you can refer to the following documentation: https://in.mathworks.com/help/deeplearning/ug/long-short-term-memory-networks.html
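To monitor the training and validation loss curves, trainingOptions can plot both live during training. Below is a sketch; the held-out split X_val/Y_val and the specific hyperparameter values are assumptions you would tune for your problem.

```matlab
options = trainingOptions('adam', ...
    'MaxEpochs', 200, ...
    'InitialLearnRate', 1e-3, ...          % try a range, e.g. 1e-2 down to 1e-4
    'GradientThreshold', 1, ...            % clip gradients to stabilize LSTM training
    'ValidationData', {X_val, Y_val}, ...  % hypothetical held-out validation split
    'ValidationFrequency', 30, ...
    'ValidationPatience', 10, ...          % stop early when validation loss stalls
    'Shuffle', 'every-epoch', ...
    'Plots', 'training-progress', ...      % live training/validation loss curves
    'Verbose', false);

net = trainNetwork(X_train, Y_train, layers, options);
```

A widening gap between the two curves is the usual sign of the overfitting described above, and the validation-patience setting stops training before the model memorizes noise.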