DAMODARAN B.K

Last seen: 4 years ago | Active since 2021

Followers: 0 Following: 0

Statistics

Feeds

All (2)
MATLAB Answers (2)

Question

Why does an RL agent perform the same actions repeatedly when this still does not constitute an optimal policy or a better episode Q0? Can anyone explain?

4 years ago | 0 answers | 0


Question

Episode Q0 increases exponentially
Can anyone explain why episode Q0 in RL increases exponentially after convergence of reward to a suboptimal policy?

4 years ago | 1 answer | 0
