DQN agent throws an error
I have a DQN agent model (attached) that I run with:
load_system('RL_model.slx')
initilize_RLmodel
obsInfo = rlNumericSpec([23 1]);
obsInfo.Name = 'observations';
numObservations = obsInfo.Dimension(1);
load('./action_space.mat');
actInfo = rlFiniteSetSpec(action_space);
env = rlSimulinkEnv('RL_model','RL_model/RL Agent',obsInfo,actInfo);
env.ResetFcn = @(in)localResetFcn(in);
initialObs = reset(env);
actInfo = getActionInfo(env);
numActions = actInfo.Dimension(1);
Ts = 1; Tf = 1;
initOpts = rlAgentInitializationOptions('NumHiddenUnit',500);
agentObj = rlDQNAgent(obsInfo,actInfo,initOpts);
maxepisodes = 1e5; maxsteps = ceil(Tf/Ts);
trainOpts = rlTrainingOptions(...
'MaxEpisodes',maxepisodes, ...
'MaxStepsPerEpisode',maxsteps, ...
'ScoreAveragingWindowLength',20, ...
'Verbose',false, ...
'Plots','training-progress',...
'StopTrainingCriteria','AverageReward',...
'StopTrainingValue',100);
trainingStats = train(agentObj,env,trainOpts);
The system is an equation with some hyperparameters that I want to optimize. Its output is a single number: if the output is greater than a threshold (set to 5 here) and the reference label is +1 (detected), the agent receives a +100 reward; if the output and the label do not match, it receives a -100 reward. It is similar to a classifier whose coefficients I am optimizing.
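For clarity, here is a rough sketch of the reward logic described above. It is illustrative only: the actual reward is computed inside RL_model.slx, and the function name computeReward is made up.
function r = computeReward(y, label)
% y     : scalar output of the system for the current step
% label : reference label, +1 (detected) or -1
threshold = 5;                      % detection threshold from the description
predicted = 2*(y > threshold) - 1;  % +1 if the output exceeds the threshold, else -1
if predicted == label
    r = 100;                        % output agrees with the reference label
else
    r = -100;                       % mismatch between output and label
end
end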
However, training always throws the error below after the 63rd episode, and I have no idea what is wrong.
Error using rl.train.SeriesTrainer/run
There was an error executing the ProcessExperienceFcn for block "RL_model/RL Agent".
Caused by:
Error using rl.function.AbstractFunction/gradient
Unable to compute gradient from function model.
Error in rl.agent.rlDQNAgent/criticLearn_ (line 263)
criticGradient = gradient(this.Critic_,lossFcn,...
Error in rl.agent.rlDQNAgent/learnFromBatchData_ (line 191)
criticGradient = criticLearn_(this, minibatch, maskIdx);
Error in rl.agent.AbstractOffPolicyAgent/learnFromBatchData (line 43)
this = learnFromBatchData_(this,batchData,maskIdx);
Error in rl.agent.rlDQNAgent/learnFromExperiencesInMemory_ (line 183)
learnFromBatchData(this,minibatch,maskIdx);
Error in rl.agent.mixin.InternalMemoryTrainable/learnFromExperiencesInMemory (line 32)
learnFromExperiencesInMemory_(this);
Error in rl.agent.AbstractOffPolicyAgent/learn_ (line 69)
learnFromExperiencesInMemory(this);
Error in rl.agent.AbstractAgent/learn (line 29)
this = learn_(this,experience);
Error in rl.util.agentProcessStepExperience (line 6)
learn(Agent,Exp);
Error in rl.env.internal.FunctionHandlePolicyExperienceProcessor/processExperience_ (line 31)
[this.Policy_,this.Data_] = feval(this.Fcn_,...
Error in rl.env.internal.ExperienceProcessorInterface/processExperienceInternal_ (line 137)
processExperience_(this,experience,getEpisodeInfoData(this));
Error in rl.env.internal.ExperienceProcessorInterface/processExperience (line 78)
stopsim = processExperienceInternal_(this,experience,simTime);
Error in rl.simulink.blocks.PolicyProcessExperience/stepImpl (line 45)
stopsim = processExperience(this.ExperienceProcessor_,experience,simTime);
Error in Simulink.Simulation.internal.DesktopSimHelper
Error in Simulink.Simulation.internal.DesktopSimHelper.sim
Error in Simulink.SimulationInput/sim
Error in rl.env.internal.SimulinkSimulator>localSim (line 259)
simout = sim(in);
Error in rl.env.internal.SimulinkSimulator>@(in)localSim(in,simPkg) (line 171)
simfcn = @(in) localSim(in,simPkg);
Error in MultiSim.internal.runSingleSim
Error in MultiSim.internal.SimulationRunnerSerial/executeImplSingle
Error in MultiSim.internal.SimulationRunnerSerial/executeImpl
Error in Simulink.SimulationManager/executeSims
Error in Simulink.SimulationManagerEngine/executeSims
Error in rl.env.internal.SimulinkSimulator/simInternal_ (line 172)
simInfo = executeSims(engine,simfcn,getSimulationInput(this));
Error in rl.env.internal.SimulinkSimulator/sim_ (line 78)
out = simInternal_(this,simPkg);
Error in rl.env.internal.AbstractSimulator/sim (line 30)
out = sim_(this,simData,policy,processExpFcn,processExpData);
Error in rl.env.AbstractEnv/runEpisode (line 144)
out = sim(simulator,simData,policy,processExpFcn,processExpData);
Error in rl.train.SeriesTrainer/run (line 32)
out = runEpisode(...
Error in rl.train.TrainingManager/train (line 429)
run(trainer);
Error in rl.train.TrainingManager/run (line 218)
train(this);
Error in rl.agent.AbstractAgent/train (line 83)
trainingResult = run(trainMgr,checkpoint);
Error in RunRLDQN (line 49)
trainingStats = train(agentObj,env,trainOpts);
Caused by:
Error using reshape
Number of elements must not change. Use [] as one of the size inputs to automatically calculate the appropriate size for that dimension.
Error in rl.train.TrainingManager/train (line 429)
run(trainer);
Error in rl.train.TrainingManager/run (line 218)
train(this);
Error in rl.agent.AbstractAgent/train (line 83)
trainingResult = run(trainMgr,checkpoint);
Error in RunRLDQN (line 49)
trainingStats = train(agentObj,env,trainOpts);
I would appreciate any assistance or suggestions.
3 Comments
Varun
2024-1-23
Hey, I tried to reproduce the error you are getting but could not proceed since I do not have the data2.csv file mentioned in your MATLAB scripts. If you could share the relevant data files, I can take a look at them!
Answers (1)
Himanshu
2024-1-24
Edited: Himanshu
2024-1-24
To my understanding, your DQN agent is throwing the error "Unable to compute gradient from function model".
This error can occur when the critic network is constructed in an unexpected order. Specifically, if the action input layers are added before the observation input layers, the gradient computation fails with this error.
This error will not occur if you follow any of the Reinforcement Learning Toolbox examples, such as Train DDPG Agent to Swing Up and Balance Pendulum with Image Observation.
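For illustration, here is a minimal sketch of a DQN critic whose observation path is declared before the action path. The layer names and sizes are hypothetical and not taken from the attached model; it reuses the obsInfo and actInfo defined in the question.
% Observation path is defined first, then the action path, then they are merged.
obsPath = [
    featureInputLayer(prod(obsInfo.Dimension), 'Name', 'observation')
    fullyConnectedLayer(64, 'Name', 'obsFC')];
actPath = [
    featureInputLayer(prod(actInfo.Dimension), 'Name', 'action')
    fullyConnectedLayer(64, 'Name', 'actFC')];
commonPath = [
    additionLayer(2, 'Name', 'add')
    reluLayer('Name', 'relu')
    fullyConnectedLayer(1, 'Name', 'qValue')];   % single Q(s,a) output
net = layerGraph(obsPath);
net = addLayers(net, actPath);
net = addLayers(net, commonPath);
net = connectLayers(net, 'obsFC', 'add/in1');
net = connectLayers(net, 'actFC', 'add/in2');
% The name-value pairs make the observation/action input mapping explicit.
critic = rlQValueFunction(net, obsInfo, actInfo, ...
    'ObservationInputNames', 'observation', ...
    'ActionInputNames', 'action');
agent = rlDQNAgent(critic);
Constructing the critic explicitly like this, rather than relying on the default agent created by rlDQNAgent(obsInfo,actInfo,initOpts), also makes it easier to verify that the input layers line up with obsInfo and actInfo.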
0 Comments