How to compute the gradients of a SAC agent for custom training? In addition, are the target critics updated automatically by MATLAB, given that agent = rlSACAgent()?

Question

houssam deboucha 2024-8-28

Direct link to this question

https://ww2.mathworks.cn/matlabcentral/answers/2148454-how-to-compute-the-gradients-of-sac-agent-for-custom-training-in-additon-is-the-target-critics-are

Edited: praguna manvi 2024-9-4

I'm trying to train multiple SAC agents using parallel computing. I don't know how to compute the agents' gradients using the dlfeval function, given that I have created a minibatchqueue for data processing. In addition, given that the agents are created as agent = rlSACAgent(actor1,[critic1,critic2],agentOpts), should I introduce the target critics myself, or are they handled internally by MATLAB once I specify the smoothing factor tau or the update frequency of the target critics? And how can I update them?

Answer 1

praguna manvi 2024-9-4

Direct link to this answer

https://ww2.mathworks.cn/matlabcentral/answers/2148454-how-to-compute-the-gradients-of-sac-agent-for-custom-training-in-additon-is-the-target-critics-are#answer_1510569

Edited: praguna manvi 2024-9-4

Hi @houssam deboucha,

When you train with the built-in “train” function, the actor and critic networks are updated internally for agents defined as:

agent = rlSACAgent(actor,[critic1,critic2],agentOpts);

You can find an example of training an rlSACAgent in this documentation:

https://www.mathworks.com/help/reinforcement-learning/ug/train-sac-agent-for-ball-balance-control.html#TrainSACAgentForBallBalanceControlExample-2
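As a minimal sketch of the built-in route, the target critic settings mentioned in the question can be set on the agent options before the agent is created. The numeric values below are assumptions chosen only for illustration, and the agent constructor is left commented out because it needs the actor and critics from the question.

```matlab
% Sketch only: the values are illustrative assumptions, not recommendations.
agentOpts = rlSACAgentOptions;
agentOpts.TargetSmoothFactor    = 1e-3;  % smoothing factor tau for target critic updates
agentOpts.TargetUpdateFrequency = 1;     % number of steps between target critic updates

% With actor, critic1 and critic2 defined as in the question:
% agent = rlSACAgent(actor,[critic1,critic2],agentOpts);
```

With these options set, the built-in training is expected to apply the target critic updates itself, so no separate target critic objects should need to be passed to rlSACAgent.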

For custom training, you can refer to this documentation, which outlines the functions needed:

https://www.mathworks.com/help/reinforcement-learning/ug/train-reinforcement-learning-policy-using-custom-training.html#TrainRLPolicyUsingCustomTrainLoopExample-6
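If the critics are managed by hand in a custom loop, the target update would presumably have to be applied by your own code. The sketch below shows the usual smoothing rule, target = tau*critic + (1 - tau)*target, on plain numeric arrays. The cell arrays criticParams and targetParams are hypothetical stand-ins for learnable parameters, not toolbox objects, and tau is an assumed value.

```matlab
% Sketch only: plain numeric stand-ins for the critic and target critic parameters.
tau = 0.005;                               % assumed smoothing factor

criticParams = {ones(3,2),  zeros(3,1)};   % hypothetical current critic parameters
targetParams = {zeros(3,2), ones(3,1)};    % hypothetical target critic parameters

% Soft (Polyak) update: move each target parameter a fraction tau toward the critic.
targetParams = cellfun(@(t,c) tau*c + (1 - tau)*t, ...
    targetParams, criticParams, 'UniformOutput', false);
```

Calling this once per update step (or every few steps, to mimic an update frequency) keeps the target parameters trailing slowly behind the critic.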

Typically, you can use the “getValue” or “getAction” functions to obtain the outputs, calculate the loss, and compute the gradients with “dlgradient”. Here is a link to another example with custom training using sampled minibatch experiences:

https://www.mathworks.com/help/reinforcement-learning/ug/custom-training-loop-with-simulink-action-noise.html#CustomTrainingLoopWithSimulinkActionNoiseExample-11
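The dlfeval and dlgradient pattern itself can be sketched on a small stand-in network. Everything here is an assumption for illustration: the dlnetwork is a generic regressor rather than the SAC critic object, the minibatch X and targets T are random, and the loss is a plain mean squared error instead of the SAC objective.

```matlab
% Sketch only: a generic dlnetwork standing in for a critic.
layers = [
    featureInputLayer(4)      % assumed input size (e.g. observation and action stacked)
    fullyConnectedLayer(16)
    reluLayer
    fullyConnectedLayer(1)];  % scalar output, standing in for a Q-value
net = dlnetwork(layers);

% Random minibatch of 32 samples in 'CB' (channel x batch) format.
X = dlarray(rand(4,32,'single'),'CB');
T = dlarray(rand(1,32,'single'),'CB');

% dlgradient must run inside a function evaluated by dlfeval so operations are traced.
gradFcn = @(net,X,T) dlgradient(mse(forward(net,X),T), net.Learnables);
grads = dlfeval(gradFcn, net, X, T);   % table with one row per learnable parameter
```

In a real loop, X and T would come from the minibatchqueue, and the loss would be replaced by the critic or actor loss built from the agent's outputs.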
