Bipedal walking robot TD3 training example bad convergence

Tech Logg Ding 2021-4-6

Direct link to this question:

https://ww2.mathworks.cn/matlabcentral/answers/793687-bipedal-walking-robot-td3-training-example-bad-convergence

Edited: Tech Logg Ding 2021-4-6

Hi all,

I tried running the bipedal walking robot example training myself, and it converged to a suboptimal solution. I trained the TD3 agent and hosted both my actor and critic networks on the GPU.
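For context, here is a minimal sketch of the kind of setup this refers to, assuming the R2021a `rlRepresentationOptions` interface. The seed and learning-rate values are placeholders chosen for illustration, not the settings from the example or from my run.

```matlab
% Seed MATLAB's global random number generator to reduce run-to-run
% variation (0 is an arbitrary placeholder value).
rng(0)

% Representation options that host a network on the GPU.
% 'LearnRate' is an assumed placeholder, not the example's actual value.
criticOptions = rlRepresentationOptions('LearnRate',1e-3,'UseDevice','gpu');
actorOptions  = rlRepresentationOptions('LearnRate',1e-3,'UseDevice','gpu');

% For a CPU baseline, only the device setting changes.
criticOptionsCPU = rlRepresentationOptions('LearnRate',1e-3,'UseDevice','cpu');
actorOptionsCPU  = rlRepresentationOptions('LearnRate',1e-3,'UseDevice','cpu');
```

Training once with each pair of options, everything else unchanged, would be one way to check whether the device choice alone accounts for the difference.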

The final simulation shows that the robot learned to fall at the start of the simulation. Why does my training produce significantly different results from the example? Did hosting the networks on the GPU cause this?

Here's the training plot. Note that the maximum reward was only 35, compared to the 250 shown in the example.

Thank you :)

Product: Reinforcement Learning Toolbox

Release: R2021a
