My DDPG agent starts applying one single action


Mokhtar 2022-9-12


Direct link to this question:

https://ww2.mathworks.cn/matlabcentral/answers/1803325-my-ddpg-agent-starts-applying-one-single-action

Commented: nick on 2023-11-16

Hello, I am new to deep reinforcement learning.

Using RL Toolbox, I am trying to train a DDPG agent to go to a position and stay there (start position = 0, target position = 5). If it goes above 5 or below 0, it gets a big penalty. The agent starts learning and tries different actions for the first 20 to 30 episodes, then starts applying the extreme action (+1) (action space [-1, 1]) for the next 100 episodes. It is as if it had found the optimal action to take at each step, which is weird, because if it keeps applying the action (+1) it reaches the penalty quickly, which doesn't make any sense. Even if I let it run for more than 1000 episodes, it comes back to the action (+1) every time. My reward function for now is:

reward = -0.1*(reference position - actual position)^2 - 100*(if X < 0 or X > 5)
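To make the arithmetic of this reward concrete, here is a minimal Python sketch of the formula as written above. It is only an illustration, not the poster's actual code: the function name `reward`, its parameter names, and the default bounds are assumptions taken from the description (reference position 5, allowed range 0 to 5), and "0,1" is read as the decimal 0.1.

```python
def reward(position, reference=5.0, lower=0.0, upper=5.0):
    """Per-step reward as described in the question (hypothetical helper).

    Quadratic tracking cost scaled by 0.1, plus a flat penalty of 100
    whenever the position leaves the interval [lower, upper].
    """
    tracking_cost = 0.1 * (reference - position) ** 2
    out_of_bounds = position < lower or position > upper
    penalty = 100.0 if out_of_bounds else 0.0
    return 0.0 - tracking_cost - penalty


if __name__ == "__main__":
    # Evaluate the reward at the start, midway, at the target, and just past it.
    for x in (0.0, 2.5, 5.0, 5.5):
        print(f"position={x:.1f} reward={reward(x):.3f}")
```

Under these assumptions, the start position 0 gives -2.5 per step, the target 5 gives 0, and a position of 5.5 gives about -100.025. Note that, as stated, the target sits exactly on the upper bound, so any overshoot past 5 triggers the full penalty.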

1 Comment

nick 2023-11-16

Hi Mokhtar,

Kindly specify the agent's environment. Also, what is meant by reference position? Do the start and stop positions refer to X coordinates? It would be better if you could share the code.
