Deep reinforcement learning for multi-agents
I trained three agents with the multi-agent deep reinforcement learning toolbox. The reward curves are shown in the picture. Why do the agents' rewards first increase toward the desired performance, then decrease and converge to an unfavorable outcome? I expected the rewards to keep increasing, and the agents to reach the desired goal, as training progressed. Instead, according to the picture, from episode 700 onward the agents converge to an undesired situation and their states no longer change.
Thank you.
