James Sorokhaibam


Last seen: 13 days ago | Active since 2024

Followers: 0   Following: 0

Statistics

Feeds

Question


High fluctuation in Q0 value for TD3 agent while training.
I am training a TD3 RL agent for a pick-and-place robot. The reward function is reward = exp(-E/d), where E is the total energy co...

5 months ago | 1 answer | 0
