Expected reward blows up while training (DDPG agent, reinforcement learning)
I am training a DDPG network, and after around 5000 iterations the model does not seem to converge, while the expected reward keeps increasing exponentially. What could be the reason, and how can I solve the issue?
Answers (1)
Emmanouil Tzorakoleftherakis
2020-10-12
Edited: Emmanouil Tzorakoleftherakis
2020-10-12
Hello,
This answer may be helpful.
I would make sure your reward signal outputs values that make sense, and also possibly simplify the critic network.
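One way to check whether the reward values "make sense" is to compare the critic's estimate against what the rewards can actually support. If every per-step reward satisfies |r| <= r_max and the discount factor is gamma < 1, the true discounted return can never exceed r_max / (1 - gamma) in magnitude, so a critic estimate far above that bound points to divergence in the critic rather than real improvement. Below is a minimal Python sketch of that check. It assumes you have exported the per-step rewards from a few episodes, and that the "expected reward" in the question is the critic's value estimate. The function names, the default discount of 0.99, and the `slack` factor are hypothetical and not part of any toolbox API.

```python
import math


def discounted_return_bound(max_abs_reward, discount):
    """Bound on |Q| when every per-step |reward| <= max_abs_reward."""
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be in [0, 1)")
    return max_abs_reward / (1.0 - discount)


def check_rewards(rewards, discount=0.99):
    """Summarize logged per-step rewards and count non-finite values."""
    finite = [r for r in rewards if math.isfinite(r)]
    max_abs = max((abs(r) for r in finite), default=0.0)
    return {
        "n_nonfinite": len(rewards) - len(finite),  # NaN or Inf rewards
        "max_abs_reward": max_abs,
        "q_bound": discounted_return_bound(max_abs, discount),
    }


def critic_is_diverging(q_estimate, rewards, discount=0.99, slack=1.5):
    """True if the critic's estimate exceeds what the rewards can justify.

    `slack` is an arbitrary tolerance (hypothetical default), because the
    logged rewards may not include the largest reward the environment
    can produce.
    """
    report = check_rewards(rewards, discount)
    return abs(q_estimate) > slack * report["q_bound"]
```

If the check flags divergence, the suggestions above apply: rescale or bound the reward so its magnitude stays moderate, and consider a smaller critic network.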