Reinforcement learning actions using DDPG
Greetings. I'm Jason, and I'm working on controlling a bipedal robot using reinforcement learning. I need help deciding between the two methods below using DDPG:
1_ Generate random actions with a noise variance of 10% of my action range, based on the description of the DDPG noise model.
2_ Use a low variance such as 0.5, as used in MSRA biped and humanoid training with RL.
I would really appreciate your help with this. Also, in the latter case the actions are the output of a tanh layer with low variance ([-1.5 1.5]); how is that output converted into the desired actions?
Please note that I'm fairly sure the action range I have calculated is good enough to solve the problem. I also tried higher variances, but they make the learning process less stable. Any suggestions on how I should generate the random actions?
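To make the two options and the tanh question concrete, here is a minimal sketch in plain Python rather than MATLAB. It rests on several assumptions that do not come from the examples mentioned above: the action limits are hypothetical, the "variance" figure is treated as the standard deviation of the noise, plain Gaussian noise stands in for whatever noise model the agent actually uses, and the tanh output is mapped to the action range with a simple affine scaling, which is one common convention and not necessarily what those examples do.

```python
import math
import random


def scale_tanh_output(u, low, high):
    """Map a tanh output u in [-1, 1] affinely onto [low, high]."""
    center = (high + low) / 2.0
    half_range = (high - low) / 2.0
    return center + half_range * u


def noisy_action(u, low, high, noise_std, rng):
    """Scale u, add zero-mean Gaussian noise, then clip to the limits."""
    action = scale_tanh_output(u, low, high) + rng.gauss(0.0, noise_std)
    return min(max(action, low), high)


# Hypothetical action limits, chosen only for illustration.
LOW, HIGH = -3.0, 3.0

# Option 1: noise scale set to 10% of the action range (assumed to be a std).
std_option_1 = 0.10 * (HIGH - LOW)
# Option 2: a fixed small value, independent of the action range.
std_option_2 = 0.5

rng = random.Random(0)   # seeded so the run is repeatable
u = math.tanh(0.3)       # stand-in for the actor's tanh-layer output
a1 = noisy_action(u, LOW, HIGH, std_option_1, rng)
a2 = noisy_action(u, LOW, HIGH, std_option_2, rng)
print(a1, a2)
```

With these hypothetical limits the two options give similar noise scales (about 0.6 versus 0.5), so under these assumptions the choice only matters much when the action range is far from this size.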
Thanks in advance for your time and consideration.