How to resume training a trained agent? (About Q-learning agents)

6 views (last 30 days)
When I train with Q-learning, how can I reuse the agent's experience and resume training after I stop training partway through? Will I still be able to see the training progress as I did the first time, and can I modify the parameters before resuming training?
By the way, after training, how do I view the values in the Q-table? Every time I open the Q-table, all the values are 0.
I don't understand why... Can somebody help me out with this?
Here is my code
clc;
clear;
mdl = 'FOUR_DG_0331';
open_system(mdl);
agentBlk = ["FOUR_DG_0331/RL Agent1", "FOUR_DG_0331/RL Agent2", "FOUR_DG_0331/RL Agent3", "FOUR_DG_0331/RL Agent4"];
oInfo = rlFiniteSetSpec([123,456,789]);
aInfo = rlFiniteSetSpec([150,160,170]);
aInfo1 = rlFiniteSetSpec([150,170]);
obsInfos = {oInfo,oInfo,oInfo,oInfo};
actInfos = {aInfo1,aInfo,aInfo,aInfo};
env = rlSimulinkEnv(mdl,agentBlk,obsInfos,actInfos);
Ts = 0.01;
Tf = 4;
rng(0);
qTable1 = rlTable(oInfo,aInfo1);
qTable2 = rlTable(oInfo,aInfo);
qTable3 = rlTable(oInfo,aInfo);
qTable4 = rlTable(oInfo,aInfo);
criticOpts = rlRepresentationOptions('LearnRate',0.1);
Critic1 = rlQValueRepresentation(qTable1,oInfo,aInfo1,criticOpts);
Critic2 = rlQValueRepresentation(qTable2,oInfo,aInfo,criticOpts);
Critic3 = rlQValueRepresentation(qTable3,oInfo,aInfo,criticOpts);
Critic4 = rlQValueRepresentation(qTable4,oInfo,aInfo,criticOpts);
% Agent options (QAgent_opt) are defined here
% ...
agent1 = rlQAgent(Critic1,QAgent_opt);
agent2 = rlQAgent(Critic2,QAgent_opt);
agent3 = rlQAgent(Critic3,QAgent_opt);
agent4 = rlQAgent(Critic4,QAgent_opt);
trainOpts = rlTrainingOptions;
trainOpts.MaxEpisodes = 1000;
trainOpts.MaxStepsPerEpisode = ceil(Tf/Ts);
trainOpts.StopTrainingCriteria = "EpisodeCount";
trainOpts.StopTrainingValue = 1000;
trainOpts.SaveAgentCriteria = "EpisodeCount";
trainOpts.SaveAgentValue = 15;
trainOpts.SaveAgentDirectory = "savedAgents";
trainOpts.Verbose = false;
trainOpts.Plots = "training-progress";
doTraining = false;
if doTraining
stats = train([agent1, agent2, agent3, agent4],env,trainOpts);
else
load(trainOpts.SaveAgentDirectory +"/Agents16.mat",'agent');
simOpts = rlSimulationOptions('MaxSteps',ceil(Tf/Ts));
experience = sim(env,[agent1 agent2 agent3 agent4],simOpts)
end

Accepted Answer

Emmanouil Tzorakoleftherakis
Hello,
To see how to view the table values, take a look at the answer here.
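The linked answer isn't reproduced here, but as a rough sketch of one way to do it (assuming the R2020b representation-based API used in your script, with trainedAgent standing in for any one of the trained agents):

critic  = getCritic(trainedAgent);           % rlQValueRepresentation built from the rlTable
qValues = getLearnableParameters(critic);    % cell array holding the learnable table values
disp(qValues{1})                             % matrix of Q-values (observations vs. actions)

One thing to watch out for: the qTable1..qTable4 variables in your workspace are only used to initialize the critics, so opening them after training will typically still show zeros; the learned values live inside the trained agent's critic, as above.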
Also, you don't have to do anything specific to continue training where you left off - you can simply begin training again and the policy will start from whatever values/weights it had previously.
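Concretely, a minimal sketch of a second run could look like this (the option values below are only example tweaks; the Episode Manager window opens again because trainOpts.Plots is still "training-progress"):

% The agents are handle objects, so after the first train call agent1..agent4
% already hold the learned Q-tables; calling train again continues from them.
trainOpts.MaxEpisodes       = 500;    % example: run fewer additional episodes
trainOpts.StopTrainingValue = 500;
stats2 = train([agent1, agent2, agent3, agent4], env, trainOpts);

% If MATLAB was restarted in between, reload the agents saved through
% SaveAgentCriteria first; check which variable name the .mat file actually
% contains before loading it:
whos('-file', fullfile("savedAgents", "Agents16.mat"))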

More Answers (0)

Release

R2020b
