Expected reward blows up while training (DDPG agent, reinforcement learning)


Sayak Mukherjee 2020-10-12


Direct link to this question:

https://ww2.mathworks.cn/matlabcentral/answers/610916-expected-reward-blows-up-while-training-ddpg-agent-reinforcement-learning

Edited: Emmanouil Tzorakoleftherakis 2020-10-12

I am training a DDPG network, and after around 5000 iterations the model does not seem to converge, while the expected reward keeps increasing exponentially. What could be the reason, and how can I solve the issue?


Answers (1)

Emmanouil Tzorakoleftherakis 2020-10-12


Direct link to this answer:

https://ww2.mathworks.cn/matlabcentral/answers/610916-expected-reward-blows-up-while-training-ddpg-agent-reinforcement-learning#answer_511696

Edited: Emmanouil Tzorakoleftherakis 2020-10-12

Hello,

This answer may be helpful.

I would make sure your reward signal outputs values that make sense, and also possibly simplify the critic network.
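One way to check whether the reward values "make sense" is to compare the critic's estimate against the largest return the reward can actually produce. If every step reward satisfies |r| <= r_max and the discount factor gamma is below 1, the discounted return can never exceed r_max / (1 - gamma) in magnitude, so an estimate far beyond that bound suggests the critic is diverging rather than the policy improving. The sketch below is plain Python and independent of any toolbox; it assumes the "expected reward" in the question is the critic's estimate of the discounted return, and the function names, the `slack` factor, and the numbers are hypothetical.

```python
def max_discounted_return(r_max, gamma, horizon=None):
    """Upper bound on |discounted return| when every step reward satisfies |r| <= r_max."""
    if not 0.0 <= gamma < 1.0 and horizon is None:
        raise ValueError("an infinite horizon needs 0 <= gamma < 1")
    if horizon is None:
        # Infinite geometric series: r_max * (1 + gamma + gamma**2 + ...)
        return r_max / (1.0 - gamma)
    # Finite geometric sum: r_max * (1 + gamma + ... + gamma**(horizon - 1))
    return r_max * sum(gamma**k for k in range(horizon))


def critic_looks_diverged(q_estimate, r_max, gamma, horizon=None, slack=1.5):
    """True when a critic estimate exceeds the achievable bound by more than `slack` times."""
    return abs(q_estimate) > slack * max_discounted_return(r_max, gamma, horizon)


# Hypothetical numbers: rewards clipped to [-1, 1], discount factor 0.99
bound = max_discounted_return(r_max=1.0, gamma=0.99)  # roughly 100
print(bound)
print(critic_looks_diverged(q_estimate=5e4, r_max=1.0, gamma=0.99))
print(critic_looks_diverged(q_estimate=80.0, r_max=1.0, gamma=0.99))
```

If the estimate is far outside the bound, rescaling or clipping the reward so that r_max is of order one is a common first step before changing the network.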

Comments (2)

Sayak Mukherjee 2020-10-12

Thanks for your answer

What does simplifying the critic network mean? Does it mean using fewer nodes and hidden layers?

Emmanouil Tzorakoleftherakis 2020-10-12

That's right
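To get a feel for how much "fewer nodes and hidden layers" shrinks a critic, the following sketch counts the trainable parameters of a fully connected network for two layer layouts. It is plain Python rather than toolbox code, and the input size (4 observations plus 1 action) and both layer layouts are hypothetical values chosen for illustration, not taken from this thread.

```python
def mlp_parameter_count(n_inputs, hidden_sizes, n_outputs=1):
    """Weights plus biases of a fully connected network with the given layer widths."""
    sizes = [n_inputs, *hidden_sizes, n_outputs]
    # Each layer has fan_in * fan_out weights and fan_out biases.
    return sum((fan_in + 1) * fan_out for fan_in, fan_out in zip(sizes, sizes[1:]))


# Hypothetical critic input: 4 observations concatenated with 1 action
large = mlp_parameter_count(5, [400, 300])
small = mlp_parameter_count(5, [64, 64])
print(large, small)
```

A smaller critic has less capacity to fit noise in the bootstrapped targets, which is one plausible reason the simplification can help with runaway value estimates.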
