Centralized vs. Decentralized Training for Multi-Agent Reinforcement Learning

59 views (last 30 days)
What exactly are the differences between centralized and decentralized training for multi-agent reinforcement learning? Is centralized training the same as the CTDE (centralized training and decentralized execution) paradigm seen in much of the multi-agent RL literature? When I run centralized training, the main difference I notice is that all agents appear to receive the same episode Q0 value, which I believe means they share the same critic. I see that both methods are used in the tutorials, so I'm trying to get a clearer picture of what the differences are and when to use one versus the other.

Accepted Answer

Ashu on 31 Jul 2023
Hi Kyle,
I understand that you want to know the difference between "centralized" and "decentralized" learning strategies in Reinforcement Learning.
In MATLAB, the terms "centralized" and "decentralized" refer to different learning strategies for agent groups. Let's explore the differences between these two strategies:
1. Decentralized Training:
  • In decentralized training, each agent collects its own set of experiences during the episodes and learns independently from those experiences.
  • Agents maintain their own critics (value functions) and policies, which are updated based on their own experiences.
  • There is no sharing of experiences or learning updates between agents.
  • This approach is suitable when agents have distinct roles or objectives and should learn independently without coordination (see the sketch after this list).
2. Centralized Training:
  • In centralized training, agents share the collected experiences and learn from them together.
  • All agents within a specific agent group (as defined by `AgentGroups`) share the same critic (value function) and policy.
  • The critic is updated based on the collective experiences of all agents in the group, allowing them to learn from a shared knowledge base.
  • Policies are shared among agents to promote coordination and collaboration.
  • This approach is useful when agents need to coordinate their actions and learn from a common perspective, such as in cooperative tasks or when there is a need for centralized decision-making.
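For a fully decentralized setup, a minimal sketch is shown below. The "auto" grouping value and its meaning are assumptions based on the defaults described in the 'rlMultiAgentTrainingOptions' documentation.
trainOpts = rlMultiAgentTrainingOptions( ...
    AgentGroups="auto", ...             % assumed default: each agent in its own group
    LearningStrategy="decentralized")   % each agent learns only from its own experiences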
'AgentGroups' and 'LearningStrategy' must be used together to specify whether each agent group learns in a centralized or decentralized manner.
For example, you can use the following command to configure training for three agent groups with different learning strategies. The agents with indices [1,2] and [3,5] learn in a centralized manner, while agent 4 learns in a decentralized manner.
trainOpts = rlMultiAgentTrainingOptions(AgentGroups={[1,2],4,[3,5]}, ...
    LearningStrategy=["centralized","decentralized","centralized"])
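For context, here is a sketch of how these options might flow into training; the environment 'env' and the agent variables are hypothetical placeholders, not part of the original example.
% Hypothetical five-agent setup; only the options usage is the point here.
trainOpts = rlMultiAgentTrainingOptions( ...
    AgentGroups={[1,2],4,[3,5]}, ...
    LearningStrategy=["centralized","decentralized","centralized"], ...
    MaxEpisodes=1000);
% The order of agents in the array corresponds to the indices in AgentGroups.
results = train([agent1,agent2,agent3,agent4,agent5], env, trainOpts);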
The paradigm of CTDE (centralized training and decentralized execution) is indeed related to the concept of centralized training in multi-agent RL. In CTDE, agents exploit shared information during training, typically through a common critic trained on the experiences of all agents, but during execution or deployment each agent acts independently on its own local observations, without communication or coordination.
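As a sketch of the "decentralized execution" half (the variable names here are illustrative), each trained agent can be queried independently with 'getAction', using only its own local observation:
% The shared critic is used only during training; action selection at
% deployment needs nothing beyond each agent's own observation.
act1 = getAction(agent1, {obs1});
act2 = getAction(agent2, {obs2});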
When to use centralized or decentralized training depends on the problem and the desired behavior of the agents. If coordination and collaboration are essential, centralized training can be beneficial. On the other hand, if agents have distinct roles or should act independently, decentralized training is more appropriate.
Please refer to the documentation for 'rlMultiAgentTrainingOptions' to learn more about the usage of the 'centralized' and 'decentralized' learning strategies.
I hope this information was helpful.
3 Comments
Yiwen Zhang on 16 Oct 2024 (edited)
Hello @Ashu:
I have tried centralized training and extracted the actor and critic neural networks of every agent. I found that all the actor networks share the same parameters, and so do the critic networks. Does each actor or critic use all agents' mini-batches to update itself?
For example, if there are 3 agents and each has a mini-batch size of 128, are 128*3 samples used for actor or critic training?
Another question: what is the input to the critic network? Each agent's own state space, or some kind of joint state space?


Release

R2022a
