Control the exploration in soft actor-critic

Question

Sayak Mukherjee 2022-3-22

Direct link to this question: https://ww2.mathworks.cn/matlabcentral/answers/1677984-control-the-exploration-in-soft-actor-critic

Answered: Ahmed R. Sayed 2022-10-4

What is the best way to control exploration in a SAC agent? For a TD3 agent, I used to control exploration by adjusting the variance parameter of the agent. Is there a similar option for the SAC agent? Currently, the agent seems to be exploring more than required.

Answer 1

Ahmed R. Sayed 2022-10-4

Direct link to this answer: https://ww2.mathworks.cn/matlabcentral/answers/1677984-control-the-exploration-in-soft-actor-critic#answer_1066275

Hi Mukherjee,

You can control the agent's exploration by adjusting the entropy temperature options, "EntropyWeightOptions", in rlSACAgentOptions.
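For context, here is the maximum-entropy objective from the SAC paper by Haarnoja et al., written as a sketch. It shows why the temperature acts as an exploration knob. I am assuming that the temperature $\alpha$ below is what the toolbox exposes as the entropy weight; the toolbox's exact loss may be written differently.

```latex
J(\pi) = \sum_{t=0}^{T} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}
  \left[ r(s_t, a_t) + \alpha \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \right]
```

The entropy term $\mathcal{H}$ rewards a more random policy, so a larger $\alpha$ favors exploration and a smaller $\alpha$ makes the agent more greedy with respect to reward.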

For example, large values of "EntropyWeight" encourage the agent to explore the environment, so lowering it reduces exploration. Alternatively, you can let the agent tune the weight itself: the temperature learning rate "LearnRate" sets how quickly the weight is adjusted to reach the target entropy "TargetEntropy" [1]. If you would rather keep the weight fixed at the value you chose, set the learning rate to zero.
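The two approaches could be set up roughly as follows. This is a minimal sketch: the numeric values are illustrative assumptions rather than recommendations, and actor and critics in the last comment are placeholders for objects you have already created.

```matlab
% Sketch only: numeric values are illustrative assumptions.

% Approach 1: fixed entropy weight.
% A zero learning rate keeps the weight at the value set here.
fixedOpts = rlSACAgentOptions;
fixedOpts.EntropyWeightOptions.EntropyWeight = 0.05;  % smaller weight, less exploration
fixedOpts.EntropyWeightOptions.LearnRate = 0;

% Approach 2: let the agent tune the weight toward a target entropy.
tunedOpts = rlSACAgentOptions;
tunedOpts.EntropyWeightOptions.EntropyWeight = 1;     % initial weight
tunedOpts.EntropyWeightOptions.LearnRate = 3e-4;      % how fast the weight adapts
tunedOpts.EntropyWeightOptions.TargetEntropy = -4;    % assumed value; lower target, less exploration

% Pass the chosen options when creating the agent, for example:
% agent = rlSACAgent(actor, critics, fixedOpts);  % actor and critics are placeholders
```

If the agent explores more than needed, a lower fixed weight is the simpler thing to try first. Lowering the target entropy keeps automatic tuning but steers the policy toward less randomness.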

[1] Haarnoja, Tuomas, Aurick Zhou, Kristian Hartikainen, George Tucker, Sehoon Ha, Jie Tan, Vikash Kumar, et al. "Soft Actor-Critic Algorithms and Applications." Preprint, submitted January 29, 2019. https://arxiv.org/abs/1812.05905.
