Ahmed R. Sayed
Followers: 0 Following: 0
Feeds
Answered
is actor-critic agent learning?
Hi karim bio gassi, from your figure the discounted reward value is very large. Try to rescale it to a certain value [-10, 1...
2 years ago | 0
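A minimal sketch of the rescaling suggested in this answer; the raw reward value, the scale factor, and the target range below are assumptions chosen only for illustration:

```matlab
% Rescale and clip a large raw reward into a small bounded range before
% returning it to the agent. All numbers here are placeholder assumptions.
rawReward   = -2.4e4;                                    % example of an overly large reward
rewardScale = 1e-3;                                      % assumed scaling factor
reward      = min(max(rawReward*rewardScale, -10), 10);  % keep the reward within [-10, 10]
```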
Answered
Control the exploration in soft actor-critic
Hi Mukherjee, you can control the agent's exploration by adjusting the entropy temperature options "EntropyWeightOptions" from t...
2 years ago | 0
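A minimal sketch of adjusting the entropy temperature options on a SAC agent, assuming a recent Reinforcement Learning Toolbox release; the observation and action specs are placeholders for your own environment:

```matlab
% Placeholder specs; replace with the specs of your own environment.
obsInfo = rlNumericSpec([4 1]);
actInfo = rlNumericSpec([1 1], 'LowerLimit', -1, 'UpperLimit', 1);

agent = rlSACAgent(obsInfo, actInfo);   % SAC agent with default actor/critic networks

% The entropy temperature options control exploration: a larger EntropyWeight
% encourages higher-entropy (more exploratory) policies.
agent.AgentOptions.EntropyWeightOptions.EntropyWeight = 1;
agent.AgentOptions.EntropyWeightOptions.LearnRate     = 3e-4;  % adaptation rate of the weight
```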
Answered
Is it possible to implement a prioritized replay buffer (PER) in a TD3 agent?
By default, built-in off-policy agents (DQN, DDPG, TD3, SAC, MBPO) use an rlReplayMemory object as their experience buffer. Agen...
2 years ago | 0
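A minimal sketch of swapping the default buffer for a prioritized one, assuming a release where rlPrioritizedReplayMemory is available (R2022b or later); the specs are placeholders:

```matlab
% Placeholder specs; replace with the specs of your own environment.
obsInfo = rlNumericSpec([4 1]);
actInfo = rlNumericSpec([1 1], 'LowerLimit', -1, 'UpperLimit', 1);

agent = rlTD3Agent(obsInfo, actInfo);   % TD3 agent with the default rlReplayMemory buffer

% Replace the default experience buffer with a prioritized replay memory (PER).
agent.ExperienceBuffer = rlPrioritizedReplayMemory(obsInfo, actInfo);
```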
Answered
Modifying the control actions to safe ones before storing in the experience buffer during SAC agent training.
I found the solution: You need to use the Simulink environment and the RL Agent block with the last action port.
2 years ago | 0 | Accepted
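A minimal sketch of the Simulink setup this answer points to; the model and block names are placeholders, and the last action input port is enabled in the RL Agent block dialog:

```matlab
% Placeholder specs and model/block names; replace with your own.
obsInfo = rlNumericSpec([4 1]);
actInfo = rlNumericSpec([1 1], 'LowerLimit', -1, 'UpperLimit', 1);

mdl = 'safeSACModel';                                      % assumed Simulink model name
env = rlSimulinkEnv(mdl, [mdl '/RL Agent'], obsInfo, actInfo);

% In the model, route the safety-modified (applied) action back into the
% RL Agent block's last action input port, so the experience buffer stores
% the action that was actually applied to the plant.
```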
Asked
Modifying the control actions to safe ones before storing in the experience buffer during SAC agent training.
Hello everyone, I am implementing a safe off-policy DRL SAC algorithm. Using an iterative convex optimization algorithm moves a...
3 years ago | 1 answer | 0