- Overfitting to a specific set of data
- Different objectives for the agent
- Architectural differences in the neural networks used in the agent
- Exploration vs. exploitation tradeoff
- Incorrectly initialized hyperparameters
Why did the agent not train faster after pre-training?
Hi,
I retrained a pre-trained agent in the same environment. I expected the model to converge faster, but it did not.
First picture: first training run
Second picture: training with the pre-trained agent
It seems the agent goes through the same training all over again. My question is: why is the second run not faster?
Agent settings:
agentOpts = rlDQNAgentOptions(...
    'UseDoubleDQN',true,...
    'MiniBatchSize',64,...
    'SaveExperienceBufferWithAgent',true);
agentOpts.EpsilonGreedyExploration.EpsilonDecay = 1e-3;
agentOpts.EpsilonGreedyExploration.Epsilon = 0.9;
agentOpts.CriticOptimizerOptions.LearnRate = 0.01;
agentOpts.CriticOptimizerOptions.GradientThreshold = 1;
Train_Old_Model = true; % Set to true to use the pre-trained agent
agentOpts.ResetExperienceBufferBeforeTraining = not(Train_Old_Model);
if Train_Old_Model
    % Load experiences from pre-trained agent
    load("XYAgent.mat",'agent');
else
    % New DQN agent
    agent = rlDQNAgent(critic,agentOpts);
end
Training settings:
maxEpisodes = 1300;
maxStepsPerEpisode = 20;
trainOpts = rlTrainingOptions(...
    MaxEpisodes=maxEpisodes, ...
    MaxStepsPerEpisode=maxStepsPerEpisode, ...
    Verbose=false, ...
    ScoreAveragingWindowLength=100, ...
    Plots="training-progress", ...
    StopTrainingCriteria="EpisodeCount", ...
    StopTrainingValue=maxEpisodes);
plot(env)
% Train
doTraining = true;
if doTraining
    % Train the agent.
    trainingStats = train(agent,env,trainOpts);
    save("XYAgent.mat","agent")
else
    % Load the pretrained agent for the example.
    load("XYAgent.mat","agent")
end
Thank you!
Accepted Answer
Piyush Dubey
2023-6-2
Hi Kun,
An agent may take longer to converge for several reasons. The ways to save a model and resume training are described in the documentation below:
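As a rough illustration of resuming training, here is a minimal sketch based on the script in the question. It assumes that XYAgent.mat holds a variable named agent and that env and trainOpts already exist in the workspace. One thing worth noting in that script: when the agent is loaded from the file, the agentOpts variable is never passed to it, so the loaded agent keeps whatever options it was created with. The sketch therefore edits the options on the agent object itself. The Epsilon and LearnRate values are illustrative assumptions, not recommendations from the original answer.

```matlab
% Sketch only: resume training from a saved DQN agent.
% Assumes XYAgent.mat contains a variable named "agent" (as in the question)
% and that env and trainOpts are already defined in the workspace.
load("XYAgent.mat","agent");

% Options set on a separate agentOpts variable do not reach a loaded agent,
% so change them on the agent object directly.
agent.AgentOptions.ResetExperienceBufferBeforeTraining = false; % keep saved experiences
agent.AgentOptions.EpsilonGreedyExploration.Epsilon = 0.1;      % illustrative: explore less
agent.AgentOptions.CriticOptimizerOptions.LearnRate = 1e-3;     % illustrative: smaller updates

% Continue training and save the updated agent.
trainingStats = train(agent,env,trainOpts);
save("XYAgent.mat","agent");
```

Whether a lower starting Epsilon or learning rate actually helps depends on the environment, so it may be worth comparing the reward curves of a run with and without these changes.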
Possible reasons why a pre-trained agent can take longer to converge in the same environment are:
The pointers above can be used to diagnose why the agent converges more slowly.
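To make the exploration point concrete, the short calculation below estimates how long the epsilon-greedy schedule from the question stays in its exploratory phase. It uses the per-step update Epsilon = Epsilon*(1-EpsilonDecay), applied while Epsilon is above EpsilonMin. The value of EpsilonMin is an assumption (the question does not set it, so the default of 0.01 is used here), and episodes are assumed to run the full 20 steps.

```matlab
% Values taken from the question's settings.
epsilon0        = 0.9;   % EpsilonGreedyExploration.Epsilon
epsilonDecay    = 1e-3;  % EpsilonGreedyExploration.EpsilonDecay
stepsPerEpisode = 20;    % MaxStepsPerEpisode (assumes full-length episodes)

% Assumption: EpsilonMin is not set in the question; 0.01 is used here.
epsilonMin = 0.01;

% Per-step update: epsilon <- epsilon*(1 - epsilonDecay) while epsilon > epsilonMin.
% Solve epsilon0*(1 - epsilonDecay)^n <= epsilonMin for the smallest integer n.
nSteps    = ceil(log(epsilonMin/epsilon0) / log(1 - epsilonDecay));
nEpisodes = ceil(nSteps / stepsPerEpisode);

fprintf("Steps until EpsilonMin: %d (about %d full-length episodes)\n", ...
    nSteps, nEpisodes);
```

If exploration starts again from Epsilon = 0.9 when training resumes, the agent would spend a few hundred episodes acting mostly at random before its learned policy dominates, which could make the second training curve look much like the first.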
Hope this helps.