Reinforcement learning DDPG action fluctuations
When I tried to train the path following control example in MATLAB, the training process produced the behaviour shown in the picture:

- The steering angle is constantly fluctuating.
- The acceleration is also constantly fluctuating.
- The reward convergence is very noisy and seems to jump between a high reward and a low reward.
What could be causing this issue? It also happened in other projects I worked on. One method I tried was to penalise the fluctuation in the reward function using this term, inspired by a paper published by Wang et al.:
10*[ d/dt(current_action) * d/dt(previous_action) < 0 ]
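To make the term concrete, here is a minimal sketch of how it could be evaluated in discrete time. It is written in Python only for illustration; the same logic could sit in a MATLAB function block. The function name `fluctuation_penalty`, the three-sample finite-difference approximation of the derivatives, and the sample values are assumptions of this sketch, not taken from the paper or the MATLAB example. Only the weight of 10 comes from the term above.

```python
def fluctuation_penalty(a_curr, a_prev, a_prev2, weight=10.0):
    """Return `weight` if the action's rate of change flipped sign, else 0.

    Assumption: derivatives are approximated by backward differences over
    three consecutive action samples. Dividing by a positive sample time
    would not change the sign of the product, so it is omitted.
    """
    rate_curr = a_curr - a_prev    # stand-in for d/dt(current_action)
    rate_prev = a_prev - a_prev2   # stand-in for d/dt(previous_action)
    return weight if rate_curr * rate_prev < 0 else 0.0


# Hypothetical steering commands, ordered newest to oldest.
oscillating = fluctuation_penalty(0.2, -0.1, 0.3)  # rate goes -0.4, then +0.3
smooth = fluctuation_penalty(0.3, 0.2, 0.1)        # rate stays positive
print(oscillating, smooth)
```

The returned value would be subtracted from the step reward, so an action that reverses direction from one step to the next costs 10, while a monotonic change costs nothing.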
Please let me know how to avoid this problem. Thank you very much!
2 Comments
Emmanouil Tzorakoleftherakis
2020-11-17
Hello,
One clarification: are the scope signals you are showing on the right from during training or from after training?
Tech Logg Ding
2020-11-17
Accepted Answer
More Answers (0)