Error Training Q-Learning agent with Simulink Model

Question

Pietro Gualla 2024-3-7


Direct link to this question

https://ww2.mathworks.cn/matlabcentral/answers/2091671-error-training-q-learning-agent-with-simulink-model

Commented: Pietro Gualla 2024-3-14

I am trying to train a Q-Learning agent on a Simulink Model "QLearnModel" with the following script called "Q_Gym":

% State and action sets
states = [1:1:9];
actions = [-5:1:5];

% Observation and action specifications
observationInfo = rlFiniteSetSpec(states, "Name", "States");
actionInfo = rlFiniteSetSpec(actions, "Name", "Actions");

% Q-table critic and Q-learning agent
Q = rlTable(observationInfo, actionInfo);
critic = rlQValueFunction(Q,observationInfo,actionInfo);
agentOptions = rlQAgentOptions("SampleTime", 0.05);
agent = rlQAgent(critic,agentOptions);

% Simulink environment
environment = rlSimulinkEnv("QLearnModel","QLearnModel/RL Agent", observationInfo, actionInfo);

% Training
traininingOptions = rlTrainingOptions();
trainingStats = train(agent,environment,traininingOptions);

When I run it, the following long list of errors is thrown, of which "Error using rl.util.rlFiniteSetSpec/getElementIndex: Invalid data specified. The data must be an element of the rlFiniteSetSpec" seems to be the most informative one.

Error using rl.train.SeriesTrainer/run
There was an error executing the ProcessExperienceFcn for block "QLearnModel/RL Agent".

Caused by:

Error using rl.function.AbstractFunction/gradient
Unable to compute gradient from function model.
Error in rl.agent.rlQAgent/learn_ (line 230)
    [CriticGradient, gradInfo] = gradient(this.Critic_,lossFcn,...
Error in rl.agent.AbstractAgent/learn (line 29)
    this = learn_(this,experience);
Error in rl.util.agentProcessStepExperience (line 6)
    learn(Agent,Exp);
Error in rl.env.internal.FunctionHandlePolicyExperienceProcessor/processExperience_ (line 31)
    [this.Policy_,this.Data_] = feval(this.Fcn_,...
Error in rl.env.internal.ExperienceProcessorInterface/processExperienceInternal_ (line 139)
    processExperience_(this,experience,infoData);
Error in rl.env.internal.ExperienceProcessorInterface/processExperience (line 78)
    stopsim = processExperienceInternal_(this,experience,simTime);
Error in rl.simulink.blocks.PolicyProcessExperience/stepImpl (line 45)
    stopsim = processExperience(this.ExperienceProcessor_,experience,simTime);
Error in Simulink.Simulation.internal.DesktopSimHelper
Error in Simulink.Simulation.internal.DesktopSimHelper.sim
Error in Simulink.SimulationInput/sim
Error in rl.env.internal.SimulinkSimulator>localSim (line 259)
    simout = sim(in);
Error in rl.env.internal.SimulinkSimulator>@(in)localSim(in,simPkg) (line 171)
    simfcn = @(in) localSim(in,simPkg);
Error in MultiSim.internal.runSingleSim
Error in MultiSim.internal.SimulationRunnerSerial/executeImplSingle
Error in MultiSim.internal.SimulationRunnerSerial/executeImpl
Error in Simulink.SimulationManager/executeSims
Error in Simulink.SimulationManagerEngine/executeSims
Error in rl.env.internal.SimulinkSimulator/simInternal_ (line 172)
    simInfo = executeSims(engine,simfcn,getSimulationInput(this));
Error in rl.env.internal.SimulinkSimulator/sim_ (line 78)
    out = simInternal_(this,simPkg);
Error in rl.env.internal.AbstractSimulator/sim (line 30)
    out = sim_(this,simData,policy,processExpFcn,processExpData);
Error in rl.env.AbstractEnv/runEpisode (line 144)
    out = sim(simulator,simData,policy,processExpFcn,processExpData);
Error in rl.train.SeriesTrainer/run (line 64)
    out = runEpisode(...
Error in rl.train.TrainingManager/train (line 516)
    run(trainer);
Error in rl.train.TrainingManager/run (line 253)
    train(this);
Error in rl.agent.AbstractAgent/train (line 187)
    trainingResult = run(trainMgr,checkpoint);
Error in Q_Gym (line 20)
    trainingStats = train(agent,environment,traininingOptions);

Caused by:

Error using rl.util.rlFiniteSetSpec/getElementIndex
Invalid data specified. The data must be an element of the rlFiniteSetSpec.
Error in rl.train.TrainingManager/train (line 516)
    run(trainer);
Error in rl.train.TrainingManager/run (line 253)
    train(this);
Error in rl.agent.AbstractAgent/train (line 187)
    trainingResult = run(trainMgr,checkpoint);
Error in Q_Gym (line 20)
    trainingStats = train(agent,environment,traininingOptions);

It seems that the observation received by the agent is not an element of the rlFiniteSetSpec, which is built from the vector defined in the script as:

states = [1:1:9];

The observation is generated by the following Simulink block, which outputs a number between 1 and 9 based on the values of two other variables:

I don't understand how the observation could not be in the defined set.


Answer 1

Yatharth 2024-3-14


Direct link to this answer

https://ww2.mathworks.cn/matlabcentral/answers/2091671-error-training-q-learning-agent-with-simulink-model#answer_1425036

Hi Pietro,

Looking at the issue at hand, I think it might stem from how the observation is generated or interpreted rather than from the setup of your states and actions. To investigate further, I recommend focusing on the following areas:

Even though your Simulink block is designed to output a number between 1 and 9, make sure that the output is indeed an integer. Because of floating-point arithmetic or block configuration, the output might not be exactly an integer even if it appears to be one when displayed. You might need a rounding or floor/ceiling block to guarantee an integer output.
Add a Saturation block or a boundary check to your Simulink model to strictly keep the output within the range 1 to 9. This way, even if the calculations produce a value slightly outside this range, it is corrected before being passed on as an observation.
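The two suggestions above (rounding and saturating) can be sketched as a single expression. This is only an illustration: `sanitizeObservation` is a hypothetical name that does not appear in the thread, and the cast to double assumes that the elements of the observation spec are doubles, as they are when it is built from `states = [1:1:9]`.

```matlab
% Hypothetical helper, not from the thread: map a raw signal onto the
% integer grid 1..9. The same expression could sit in a MATLAB Function block.
% Assumption: the spec elements are doubles, as created by states = [1:1:9].
sanitizeObservation = @(rawObs) min(max(round(double(rawObs)), 1), 9);

almostThree = sanitizeObservation(2.9999999);   % rounds to 3
tooLarge    = sanitizeObservation(int8(12));    % saturates to 9
tooSmall    = sanitizeObservation(-0.4);        % saturates to 1
```

Rounding is applied before saturating so that a value such as 9.2 first becomes 9 and then passes the range check unchanged.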

A general debugging suggestion: to isolate the issue, you can insert a "To Workspace" block in Simulink to monitor the output of the observation-generating block directly. This lets you see exactly which values are produced at each simulation step. Look for any values that are not integers or that lie outside the range 1 to 9.
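Once the signal has been logged, the check described above could look roughly like the following. This is a sketch under assumptions: `obsLog` is a hypothetical variable name standing in for the logged data, and its values are invented for illustration. It may also be worth looking at the very first sample and at the class of the logged signal, since either could differ from what the block logic suggests.

```matlab
% Assumption: obsLog stands in for the signal logged by the "To Workspace"
% block; these sample values are invented for illustration only.
states = 1:9;
obsLog = [1 5 9 0 3.0000001 7];

% ismember tests exact equality, so 3.0000001 does not count as 3.
isValid   = ismember(obsLog, states);
badSteps  = find(~isValid);     % sample indices with an invalid observation
badValues = obsLog(~isValid);   % the offending values themselves

fprintf('%d of %d samples are outside the state set.\n', ...
        numel(badValues), numel(obsLog));
fprintf('Logged class: %s, spec class: %s\n', class(obsLog), class(states));
```

With the invented data above, `badSteps` points at the fourth and fifth samples, that is, the out-of-range 0 and the almost-integer 3.0000001.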

1 Comment

Pietro Gualla 2024-3-14


Hi Yatharth,

Thank you very much for your suggestions. As for the first one, the observation is already cast to uint8. As for the second, I had come up with the same idea in the meantime, but unfortunately it did not help.

By the way, I also replaced the Simulink block that generates the observation with the following function, which I find much more readable.

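function obs = fcn(error_state, acc_state)
% Map the pair (error_state, acc_state) to a state index between 1 and 9.
matrix = [  1,2,3;
            4,5,6;
            7,8,9   ];
row = -error_state+2;    % error_state  1, 0, -1 -> row    1, 2, 3
column = acc_state+2;    % acc_state   -1, 0,  1 -> column 1, 2, 3
if (row >=1 && row<=3 && column >=1 && column<=3)
    obs = int8(matrix(row,column));
else
    obs = int8(2);       % fallback for out-of-range inputs
end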

If, for some unexpected reason, the inputs map to a position outside matrix, the function returns 2 (which is certainly a value in states).
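One way to rule out the lookup itself is to evaluate the same index arithmetic for every combination of inputs. The sketch below assumes that error_state and acc_state only take the values -1, 0 and 1; the thread does not state this explicitly, although the index arithmetic suggests it. The names `inputValues`, `results` and `allInSet` are introduced here for illustration only.

```matlab
% Assumption: error_state and acc_state each take values in {-1, 0, 1}.
states = 1:9;
matrix = [1,2,3; 4,5,6; 7,8,9];
inputValues = [-1 0 1];

results = zeros(3, 3);
for i = 1:3
    for j = 1:3
        row    = -inputValues(i) + 2;   % same index arithmetic as the function
        column =  inputValues(j) + 2;
        results(i, j) = matrix(row, column);
    end
end

% True when every value the lookup can produce is an element of states.
allInSet = all(ismember(results(:), states));
disp(results);
disp(allInSet);
```

If `allInSet` is true, the mapping is consistent with states for these inputs, which would point the search towards things outside the function, such as the data type of the signal or the value present at the first sample.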

I still get the exact same error. Honestly, I am at my wits' end, since it doesn't make any sense to me.

Thank you again for your time and help!
