Why do the DDPG episode rewards never change during the whole training process?
I'm training a DDPG agent using the Reinforcement Learning Toolbox in MATLAB R2020a for a path planning problem. But as you can see, the DDPG episode rewards and average rewards never change over 5000 episodes. I used a simple neural network with 20 neurons and three layers; the learning rate is set to 0.01 and the gradient threshold is 1. I then tried setting the weights and biases for the fully connected layers and changing my reward function, but the result was the same.
Accepted Answer
Emmanouil Tzorakoleftherakis
2020-5-26
It looks like Q0 and the episode reward are on very different scales. Try unchecking "Show Episode Q0" to see if the episode reward changes. I would then simplify the critic network to make sure it outputs values on a similar scale to the episode reward.
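To illustrate why a scale mismatch can make the reward curve look flat, here is a minimal sketch in base MATLAB (no toolbox calls). All numbers and variable names (`episodeReward`, `episodeQ0`, the value ranges) are hypothetical and chosen only for illustration; they are not taken from the plot in the question.

```matlab
% Hypothetical data for illustration only; not the values from the question.
rng(0);                                          % reproducible random numbers
numEpisodes   = 5000;
episodeReward = -50 + 10 * rand(1, numEpisodes); % rewards vary within [-50, -40]
episodeQ0     = 1e6 * ones(1, numEpisodes);      % critic estimate on a far larger scale

% Span of a y-axis that has to show both series at once
allValues  = [episodeReward, episodeQ0];
sharedSpan = max(allValues) - min(allValues);

% Span of the reward series on its own
rewardSpan = max(episodeReward) - min(episodeReward);

fprintf('Reward variation uses %.4f%% of the shared y-axis.\n', ...
        100 * rewardSpan / sharedSpan);

% Plotting the reward alone makes its variation visible again
plot(1:numEpisodes, episodeReward);
xlabel('Episode');
ylabel('Episode reward');
```

With these assumed numbers the reward variation takes up only a tiny fraction of a percent of the shared axis, so it is drawn as a flat line even though it changes from episode to episode. Hiding Q0 in the training plot has the same effect as the last `plot` call here: the axis rescales to the reward alone.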