How to know if an RL agent has been updated

Question

Haochen 2024-5-16

https://ww2.mathworks.cn/matlabcentral/answers/2119486-how-to-know-if-an-rl-agent-has-been-updated

Commented: Tejas 2024-5-23

Hi all,

I want to train an RL agent and would like to confirm that the agent is actually updated by training. How can I check whether the agent has been updated?

For example, in the official example of 'rl/TrainMultipleAgentsForAreaCoverageExample', I extracted the code related to the agent definition, the training and the simulation:

%...
agentA = rlPPOAgent(actor(1),critic(1),opt);
agentB = rlPPOAgent(actor(2),critic(2),opt);
agentC = rlPPOAgent(actor(3),critic(3),opt);
%...
if doTraining
    result = train([agentA,agentB,agentC],env,trainOpts);
else
    load("rlAreaCoverageAgents.mat");
end
%...
rng(0) % reset the random seed
simOpts = rlSimulationOptions(MaxSteps=maxsteps);
experience = sim(env,[agentA,agentB,agentC],simOpts);

However, suppose that after training I want to check whether agentA has changed:

copy = agentA;
%the above code section where agentA is trained...
disp(copy==agentA)

The result displayed is 1. Does that mean agentA has not changed?

But this is from the official example, so I believe the agents should indeed have been trained. The simulation results also suggest this, since an agent takes significantly longer to complete the task before train() than after it.

It seems that train() does update the agents, but how can I tell explicitly from the variables in my workspace that they have been updated? And why does the comparison above not work? Thank you.

Haochen Tao

Answer 1

Tejas 2024-5-23

https://ww2.mathworks.cn/matlabcentral/answers/2119486-how-to-know-if-an-rl-agent-has-been-updated#answer_1462196

Hi Haochen,

I ran into the same issue while working with MATLAB R2024a. My understanding is that agents are handle objects, so assigning one with the '=' operator creates another reference to the same agent rather than an independent copy. As a result, both variables, copy and agentA, refer to the same underlying object. They still refer to that object after training completes, which explains why 'disp(copy == agentA)' yields 1.
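The reference behavior described above can be reproduced without any toolbox. The following is a minimal sketch that uses containers.Map, a built-in handle class, as a stand-in for an agent; the key name 'weight' is purely illustrative and has nothing to do with how agents store their parameters.

```matlab
% containers.Map is a handle class, used here only as a stand-in for an agent.
a = containers.Map();
a('weight') = 1;

b = a;              % copies the handle, not the underlying data
a('weight') = 2;    % modify the object through the first variable

disp(b('weight'))   % the second variable sees the modified value
disp(a == b)        % == on handles tests whether both refer to the same object
```

Because == on handles only tests identity, a result of 1 says nothing about whether the contents have changed.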

To assess the differences in the agent before and after training, try this workaround:

Right after creating the copy of the agent, save it into a .MAT file.

    copy = agentA; 
    save('copy.mat','copy'); 

Before proceeding with the comparison, load the copied agent from the .MAT file.

    load('copy.mat'); 
    disp(copy == agentA); 
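Since == on handle objects tests whether two variables refer to the same object, a result of 0 after loading shows that the two agents are distinct objects rather than that their weights differ. One way to compare the learned values themselves is to snapshot them with getLearnableParameters. The sketch below assumes Reinforcement Learning Toolbox (and the Deep Learning Toolbox it relies on for default networks); the small default agent and the hand-made perturbation are stand-ins for agentA and train(), not part of the original example.

```matlab
% Requires Reinforcement Learning Toolbox. A default agent stands in for agentA.
obsInfo = rlNumericSpec([4 1]);
actInfo = rlFiniteSetSpec([-1 1]);
agent = rlPPOAgent(obsInfo,actInfo);

% Snapshot the parameter values (a struct of values, not a handle).
paramsBefore = getLearnableParameters(agent);

% Stand-in for train(): shift every actor parameter by a small amount.
modified = paramsBefore;
modified.Actor = cellfun(@(x) x + 0.1,paramsBefore.Actor,"UniformOutput",false);
agent = setLearnableParameters(agent,modified);

% Compare values rather than handles.
paramsAfter = getLearnableParameters(agent);
parametersChanged = ~isequal(paramsBefore,paramsAfter);
disp(parametersChanged)
```

In the original workflow, paramsBefore would be taken just before train() and paramsAfter just after it, with no .MAT file needed.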

For more information on operations with .MAT files, please refer to the documentation below:

2 Comments

Haochen 2024-5-23

Thank you,

After reading the documentation, my understanding is that save() writes a separate copy of the content to disk. After train() is executed, both 'copy' and 'agentA' have changed in the same way, and load() then reassigns the saved content back to 'copy'. Is that right?

Tejas 2024-5-23

Yes, your understanding is correct.
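The save/load sequence discussed in these comments can be sketched with the same kind of stand-in object. This is an illustration under assumptions: containers.Map replaces the agent, a manual assignment replaces train(), and the variable names and temporary file are hypothetical.

```matlab
% Stand-in for an agent: a built-in handle object with one stored value.
m = containers.Map();
m('w') = 1;

snapshot = m;                    % second reference to the same object
matFile = [tempname '.mat'];
save(matFile,'snapshot');        % writes the current contents to disk

m('w') = 2;                      % stand-in for train() updating the object
disp(snapshot('w'))              % still the same object, so the value is updated

s = load(matFile);               % deserializing creates a new, separate object
snapshot = s.snapshot;
disp(snapshot('w'))              % the value from before the update
disp(snapshot == m)              % the two variables now refer to different objects

delete(matFile);                 % clean up the temporary file
```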
