How to know if an RL agent has been updated

Hi all,
I want to train an RL agent and would like to confirm that training actually updates it. How can I check whether the agent has been updated?
For example, in the official example of 'rl/TrainMultipleAgentsForAreaCoverageExample', I extracted the code related to the agent definition, the training and the simulation:
%...
agentA = rlPPOAgent(actor(1),critic(1),opt);
agentB = rlPPOAgent(actor(2),critic(2),opt);
agentC = rlPPOAgent(actor(3),critic(3),opt);
%...
if doTraining
    result = train([agentA,agentB,agentC],env,trainOpts);
else
    load("rlAreaCoverageAgents.mat");
end
%...
rng(0) % reset the random seed
simOpts = rlSimulationOptions(MaxSteps=maxsteps);
experience = sim(env,[agentA,agentB,agentC],simOpts);
However, suppose that after training I want to check whether agentA has changed:
copy = agentA;
%the above code section where agentA is trained...
disp(copy==agentA)
The result displayed is 1. Does that mean agentA has not changed?
This is an official example, so I believe the agents should indeed have been trained. The simulation results suggest the same: an agent takes significantly longer to complete the task before train() than after it.
It seems that train() does update the agents, but how can I tell explicitly, from the variables in my workspace, that they have been updated? And why does the comparison above not work? Thank you.
Haochen Tao

Accepted Answer

Tejas 2024-5-23
Hi Haochen,
I ran into the same issue in MATLAB R2024a. My understanding is that assigning an agent with the '=' operator does not create a separate copy; MATLAB only creates a second reference to the same agent. Both variables, copy and agentA, therefore refer to the same underlying object. They continue to do so after training, which explains why disp(copy == agentA) yields 1.
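This reference behavior is what MATLAB handle classes do, and Reinforcement Learning Toolbox agents are handle objects. A minimal sketch of the same effect, using the built-in handle class containers.Map as a stand-in for an agent (the map and its 'weights' key are purely illustrative, not part of the example):

```matlab
% containers.Map is a built-in handle class, used here as a stand-in for an agent.
m1 = containers.Map();
m2 = m1;                 % no data is copied; m2 refers to the same object as m1

m1('weights') = 0.5;     % modify the object through m1 only

sameObject = (m1 == m2);            % == on handles tests whether both refer to the same object
seenViaM2  = isKey(m2,'weights');   % the change made through m1 is visible through m2

disp(sameObject)
disp(seenViaM2)
```

Both values should display as 1: the comparison reports that the two variables are the same object, not that the object's contents are unchanged.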
To assess the differences in the agent before and after training, try this workaround:
  • Right after creating the copy of the agent, save it into a .MAT file.
copy = agentA;
save('copy.mat','copy');
  • Before proceeding with the comparison, load the copied agent from the .MAT file.
load('copy.mat');
disp(copy == agentA);
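One caveat, as far as I understand it: == on handle objects tests whether two variables refer to the same object, so an agent loaded from a .MAT file may compare as 0 with agentA even if no parameter changed. A more direct check is to compare the learnable parameter values themselves. The following is only a sketch, assuming agentA, agentB, agentC, env and trainOpts exist in the workspace as in the example, and using getActor, getCritic and getLearnableParameters from Reinforcement Learning Toolbox; the variable names are hypothetical:

```matlab
% Assumes agentA, agentB, agentC, env and trainOpts are defined as in the example.
% Snapshot the parameter values (plain data, not handles) before training.
actorBefore  = getLearnableParameters(getActor(agentA));
criticBefore = getLearnableParameters(getCritic(agentA));

result = train([agentA,agentB,agentC],env,trainOpts);

% Read the parameter values again after training.
actorAfter  = getLearnableParameters(getActor(agentA));
criticAfter = getLearnableParameters(getCritic(agentA));

% isequal compares the stored values, so a true result here means
% the parameters differ between the two snapshots.
actorChanged  = ~isequal(actorBefore,actorAfter);
criticChanged = ~isequal(criticBefore,criticAfter);

fprintf("Actor parameters changed:  %d\n",actorChanged);
fprintf("Critic parameters changed: %d\n",criticChanged);
```

If both flags are 1, train() has modified agentA in place, even though copy == agentA still reports 1 for two references to the same agent.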
For more information on operations with .MAT files, please refer to the documentation below:
2 Comments
Haochen 2024-5-23
Thank you,
After reading the documentation, my understanding is that save() duplicates the content elsewhere. Then, after train() has run and both 'copy' and 'agentA' have changed in the same way, load() assigns the duplicated content back to 'copy'. Is that right?
