I see a zero mean reward for the first agent in multi-agent RL Toolbox
Hello, I have extended the MATLAB PPO coverage path planning example to 5 agents. I can see that the first agent always receives a reward, but the toolbox always shows a zero mean reward for the first agent, as in the following image, which is not correct. Do you have any idea what is happening there?

Answers (1)
TARUN
2025-4-22
I understand that you are experiencing an issue with the reward for the first agent in your multi-agent PPO setup.
Here are a few things you can check to resolve the issue:
- Reward Function: Inspect your environment's step function. Ensure that the reward vector (or structure) includes a non-zero value for the first agent (“rlPPOAgent”).
- Agent Configuration: Make sure “rlPPOAgent” is correctly associated with its environment and policy.
- Environment Setup: Double-check the environment setup to make sure all agents interact with it as intended.
- Training Parameters: Review the training parameters specific to the first agent, such as the learning rate and discount factor.
These are some of the checks that might help you fix the problem. If they do not, please provide the code you are working with so that I can take a deeper look.
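As a rough illustration of the reward-function check in the list above, the sketch below uses only base MATLAB and no toolbox calls. It assumes you have logged the per-step rewards yourself into a matrix with one row per step and one column per agent. The variable name `rewardLog` and its sample values are hypothetical and are not taken from your model.

```matlab
% Hypothetical reward log: one row per step, one column per agent.
% Replace these sample values with rewards logged from your own environment.
rewardLog = [0 0.2 0.1 0.0 0.3;
             0 0.1 0.0 0.4 0.2;
             0 0.3 0.2 0.1 0.0];

% Per-agent mean reward over all logged steps.
meanReward = mean(rewardLog, 1);

% Indices of agents whose logged reward is zero at every step.
silentAgents = find(all(rewardLog == 0, 1));

% Report each agent that never received a non-zero reward.
for k = silentAgents
    fprintf('Agent %d never receives a non-zero reward (mean = %g).\n', ...
        k, meanReward(k));
end
```

If the first column of your own log is non-zero while the training plot still shows a zero mean for the first agent, the cause is more likely in how the reward reaches that agent than in the reward computation itself.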
Feel free to refer to this documentation on “Agents”: