How to solve critic overestimation and how to explore a specific action range

Hello,
I'm using a DDPG agent to tune a robot controller. All of my rewards are negative, my critic learning rate is 0.01, my actor learning rate is 0.0001 with the Adam optimizer, and my gradient thresholds are 1. I have two questions:
1- When my action range is [0.00001 0.2], my Q0 predicts a negative value too (although with a large bias over the actual value), but when my action range is [0.00001 0.5], my critic shows large overestimation, around big positive values. Why does this happen with the bigger action range?
2- I define my action range as [0.00001 0.5], but I know my best action sits somewhere around [0.1 0.2] most of the time. How should I define my actor to explore this range more? Is this related to the noise options? How should I define the Ornstein-Uhlenbeck noise options to explore this area?
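
For reference, below is a minimal sketch of how the setup described above might be written in R2022b. It is an assumption of the poster's configuration, not their actual code, and the Ornstein-Uhlenbeck values at the end are illustrative guesses for concentrating exploration near [0.1 0.2], not known-good numbers.

% Minimal sketch (assumed) of the DDPG options described in the question, R2022b.

% Scalar action bounded to [0.00001 0.5]
actInfo = rlNumericSpec([1 1], 'LowerLimit', 0.00001, 'UpperLimit', 0.5);

agentOpts = rlDDPGAgentOptions;

% Critic optimizer (Adam is the default algorithm): learning rate 0.01, gradient threshold 1
agentOpts.CriticOptimizerOptions.LearnRate         = 0.01;
agentOpts.CriticOptimizerOptions.GradientThreshold = 1;

% Actor optimizer: learning rate 0.0001, gradient threshold 1
agentOpts.ActorOptimizerOptions.LearnRate          = 0.0001;
agentOpts.ActorOptimizerOptions.GradientThreshold  = 1;

% Ornstein-Uhlenbeck exploration noise: the noise is added to the actor output
% and then saturated to the action limits, so StandardDeviation is the main knob
% for how far exploration strays from the current policy. 0.05 is roughly the
% half-width of the [0.1 0.2] band (an assumption), decayed slowly to a small floor.
agentOpts.NoiseOptions.StandardDeviation          = 0.05;
agentOpts.NoiseOptions.StandardDeviationDecayRate = 1e-5;
agentOpts.NoiseOptions.StandardDeviationMin       = 0.01;

% agent = rlDDPGAgent(obsInfo, actInfo, agentOpts);  % obsInfo depends on the environment

Because the noise is centered on the actor's own output, shrinking the standard deviation mainly narrows exploration around whatever the actor currently proposes; it does not by itself pull actions into [0.1 0.2].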

Answers (0)

Release: R2022b
