I have some questions about the minibatchsize attribute of PPO+LSTM

Question

xiang 2024-5-17

Direct link to this question

https://ww2.mathworks.cn/matlabcentral/answers/2119721-i-have-some-questions-about-the-minibatchsize-attribute-of-ppo-lstm

Answered: Aneela 2024-6-5

Hello everyone

I found this sentence while reading about PPO+LSTM in the official MATLAB documentation: “For a PPO agent, the trajectory length is the MiniBatchSize property of its options object”. I have some doubts about this statement. When using PPO+LSTM, does the agent no longer need to sample mini-batches from the current experience sequence? How should I understand “For a PPO agent, the trajectory length is the MiniBatchSize property of its options object”?

Hope to get your answer

Answer 1

Aneela 2024-6-5

Direct link to this answer

https://ww2.mathworks.cn/matlabcentral/answers/2119721-i-have-some-questions-about-the-minibatchsize-attribute-of-ppo-lstm#answer_1467786

Hi xiang,

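“MiniBatchSize” refers to the number of experiences used for one mini-batch update during a learning epoch. PPO is an on-policy method, so these experiences come from the data the agent has just collected with its current policy (over “ExperienceHorizon” steps), not from an experience replay buffer.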

For non-recurrent neural networks, these samples can be selected randomly because the network treats each input independently.
LSTMs need the experiences as a sequence in their original time order to learn temporal features effectively.
When using PPO with an LSTM, the agent therefore trains on sequences of consecutive experiences, so that the LSTM can learn from temporally dependent data.
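The distinction above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Reinforcement Learning Toolbox implements sampling internally; the function names `random_minibatch` and `contiguous_sequence` and the integer stand-ins for experiences are assumptions made for this example.

```python
import random

def random_minibatch(experiences, batch_size, seed=0):
    """Feedforward case: draw samples independently; their order does not matter."""
    rng = random.Random(seed)
    return rng.sample(experiences, batch_size)

def contiguous_sequence(experiences, start, length):
    """Recurrent case: take consecutive steps so the time order is preserved."""
    return experiences[start:start + length]

# Integers stand in for (state, action, reward) tuples at time steps 0..11.
experiences = list(range(12))

batch = random_minibatch(experiences, 4)      # four steps picked at random
seq = contiguous_sequence(experiences, 4, 4)  # steps 4, 5, 6, 7 in order

print(sorted(batch), seq)
```

An LSTM carries hidden state from one step to the next, so it is the second kind of slice that gives it something meaningful to learn from.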

“For a PPO agent, the trajectory length is the MiniBatchSize property of its options object”:

In general, a trajectory is a sequence of states, actions, and rewards that an agent experiences in the environment, for example from the start of an episode until a terminal state. In this context, it means a training sequence of consecutive experiences.
The agent learns from these trajectories of experiences.
When the agent uses a recurrent network such as an LSTM, the “MiniBatchSize” value specifies the length of these trajectories.
If “MiniBatchSize” is set to 50, the LSTM network will be trained on trajectories of experiences where each trajectory is 50 steps long.
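As a rough illustration of the 50-step example, the Python sketch below cuts a time-ordered set of experiences into consecutive trajectories of a fixed length. The helper `split_into_trajectories`, the 200-step horizon, and the choice to drop leftover steps are all assumptions made for this sketch; they are not taken from the MathWorks documentation.

```python
def split_into_trajectories(experiences, trajectory_length):
    """Cut a time-ordered list into consecutive, non-overlapping chunks.

    Leftover steps that do not fill a whole chunk are dropped here.
    That is a simplification for this sketch, not documented toolbox behavior.
    """
    n_full = len(experiences) // trajectory_length
    return [
        experiences[i * trajectory_length:(i + 1) * trajectory_length]
        for i in range(n_full)
    ]

# Hypothetical horizon of 200 time-ordered steps (integers stand in for experiences).
horizon = list(range(200))

# A trajectory length of 50 plays the role of MiniBatchSize = 50 in the answer above.
trajectories = split_into_trajectories(horizon, 50)

print(len(trajectories), [len(t) for t in trajectories])  # 4 [50, 50, 50, 50]
```

Each chunk keeps its steps in the original order, which is what lets a recurrent network be unrolled over it.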

For more information on PPO Agent options, please refer to the following MathWorks documentation: https://www.mathworks.com/help/reinforcement-learning/ref/rl.option.rlppoagentoptions.html
