number of look ahead steps in DDPG Agent Options


ALOK RANJAN SWAIN 2020-2-21


Direct link to this question

https://ww2.mathworks.cn/matlabcentral/answers/506744-number-of-look-ahead-steps-in-ddpg-agent-options

Commented: Dingshan Sun 2022-9-1

I want to know how the parameter "NumStepsToLookAhead" in rlDDPGAgentOptions, from the Reinforcement Learning Toolbox of MATLAB 2019b, works.

Is the look-ahead done on the target networks (i.e., a modification of the critic objective from {r + gamma*Qt - Q} to {r + sum(gamma**i*Qt) - Q})?
Or is the look-ahead done on the reward sampling itself (i.e., changing the reward "r" of each sample to "r + gamma*r_t + gamma**2*r_t+1 + ...")?

Any help is highly appreciated.


Answers (1)

Anh Tran 2020-3-1


Direct link to this answer

https://ww2.mathworks.cn/matlabcentral/answers/506744-number-of-look-ahead-steps-in-ddpg-agent-options#answer_417996

I am not sure what "reward sampling" means. "NumStepsToLookAhead" in rlDDPGAgentOptions changes the critic's target values in step 5 of the DDPG training algorithm.

Assuming g is the discount factor, the critic target will be as follows:
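
The expression itself is not reproduced in this text. As a hedged reconstruction only, the standard n-step target that matches the notation used in the comments (R_t, ..., R_t+n-1) would read as below, with n standing for NumStepsToLookAhead. The symbols Q' (target critic), mu' (target actor), and S (observation) are assumptions introduced here for illustration, not names taken from the thread.

```latex
y_t = \sum_{k=0}^{n-1} g^{k} R_{t+k} \;+\; g^{n}\, Q'\!\big(S_{t+n},\, \mu'(S_{t+n})\big)
```

With n = 1 this reduces to the usual one-step DDPG target, R_t + g*Q'(S_t+1, mu'(S_t+1)). Under this reading, both the summed rewards and the bootstrapped target-network term change with n.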

4 comments (2 earlier comments hidden)

ALOK RANJAN SWAIN 2020-3-4

Thanks for your help.

Dingshan Sun 2022-9-1

Could you give a hint on how R_t, R_t+1, R_t+2, ..., R_t+n-1 can be obtained in an online off-policy algorithm, especially for DRL methods that use experience replay?
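
One common approach, sketched here in Python as an illustration, is to hold the most recent n transitions in a small sliding window and collapse them into a single n-step transition before it is written to the replay buffer. The thread does not confirm that Reinforcement Learning Toolbox works this way; the names NStepAccumulator, critic_target, and target_q are hypothetical.

```python
from collections import deque


class NStepAccumulator:
    """Turns a stream of 1-step transitions into n-step transitions (illustrative)."""

    def __init__(self, n, gamma):
        self.n = n
        self.gamma = gamma
        self.window = deque()  # holds (state, action, reward, next_state, done)

    def _collapse(self):
        # Discounted sum of the rewards currently in the window:
        # R_t + gamma*R_t+1 + ... + gamma^(m-1)*R_t+m-1, where m = len(window).
        ret = sum((self.gamma ** k) * tr[2] for k, tr in enumerate(self.window))
        state, action = self.window[0][0], self.window[0][1]
        last_state, done = self.window[-1][3], self.window[-1][4]
        return (state, action, ret, last_state, done, len(self.window))

    def push(self, state, action, reward, next_state, done):
        """Add one environment step; return the n-step transitions that are now complete."""
        self.window.append((state, action, reward, next_state, done))
        finished = []
        if done:
            # Episode ended: flush the window, giving each remaining start
            # state a truncated (shorter than n) return.
            while self.window:
                finished.append(self._collapse())
                self.window.popleft()
        elif len(self.window) == self.n:
            finished.append(self._collapse())
            self.window.popleft()
        return finished


def critic_target(n_step_transition, gamma, target_q):
    """n-step target; target_q(s) is a hypothetical stand-in for Q'(s, mu'(s))."""
    _, _, ret, last_state, done, steps = n_step_transition
    bootstrap = 0.0 if done else (gamma ** steps) * target_q(last_state)
    return ret + bootstrap
```

Each tuple returned by push can be stored in an ordinary replay buffer and sampled at random, so the rewards R_t, ..., R_t+n-1 never have to be looked up again at training time. Note that the intermediate actions come from the behavior policy that collected the data, so this uncorrected n-step return is only approximately off-policy; many implementations accept that bias for small n.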
