Problems with control of PID parameters using reinforcement learning

Question

jh j 2023-5-15

0
链接

此问题的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/1963919-problems-with-control-of-pid-parameters-using-reinforcement-learning

回答： Divyanshu 2024-12-26

Hello,I am currently learning about reinforcement learning. I want to use reinforcement learning to generate my PID controller parameters. But when I build the simulink model and start training, my reward value keeps going to 0 and when I look at the actions generated by the model in simulink it keeps going to null.I have no idea why this is happening and I am looking for some help.Thank you in advance for your assistance.

clc;clear;close all;
open_system('BLDCRL')
Unable to find system or file 'BLDCRL'.
obsInfo = rlNumericSpec([3 1],...
    'LowerLimit',[-inf -inf -inf ]',...
    'UpperLimit',[ inf  inf inf]');
numObservations = obsInfo.Dimension(1);
actInfo = rlNumericSpec([3 1]);
numActions = actInfo.Dimension(1);
env = rlSimulinkEnv('BLDCRL','BLDCRL/RL Agent',...
    obsInfo,actInfo);
env.ResetFcn = @localResetFcn;
Ts = 5;
Tf = 100;
rng(0)
statePath = [
    featureInputLayer(numObservations,'Normalization','none','Name','State')
    fullyConnectedLayer(500,'Name','CriticStateFC1')
    reluLayer('Name','CriticRelu1')
    fullyConnectedLayer(500,'Name','CriticStateFC2')];
actionPath = [
    featureInputLayer(numActions,'Normalization','none','Name','Action')
    fullyConnectedLayer(500,'Name','ActionFC1')];
commonPath = [
    additionLayer(2,'Name','add')
    reluLayer('Name','CriticCommonRelu')
    fullyConnectedLayer(1,'Name','CriticOutput')];
criticNetwork = layerGraph();
criticNetwork = addLayers(criticNetwork,statePath);
criticNetwork = addLayers(criticNetwork,actionPath);
criticNetwork = addLayers(criticNetwork,commonPath);
criticNetwork = connectLayers(criticNetwork,'CriticStateFC2','add/in1');
criticNetwork = connectLayers(criticNetwork,'ActionFC1','add/in2');
% figure
% plot(criticNetwork)
criticOpts = rlRepresentationOptions('LearnRate',1e-04,'GradientThreshold',1);
% critic = rlQValueFunction(criticNetwork,...
%              obsInfo,actInfo, ...
%              "ActionInputNames",{'Action'},"ObservationInputNames",{'State'},"UseDevice","gpu");
critic = rlRepresentation(criticNetwork,obsInfo,actInfo,'Observation',{'State'},'Action',{'Action'},criticOpts);
actorNetwork = [
    featureInputLayer(numObservations,'Normalization','none','Name','State')
    fullyConnectedLayer(500,'Name','ActorFC1')
    reluLayer('Name','ActorRelu1') 
    fullyConnectedLayer(500,'Name','ActorFC2')
    reluLayer('Name','ActorRelu2') 
    fullyConnectedLayer(numActions,'Name','Action')
    ];
actorOptions = rlRepresentationOptions('LearnRate',1e-05,'GradientThreshold',1);
actor = rlRepresentation(actorNetwork,obsInfo,actInfo,'Observation',{'State'},'Action',{'Action'},actorOptions);
% actor = rlContinuousDeterministicActor(actorNetwork,obsInfo,actInfo, ...
%     "ObservationInputNames",{'State'},"UseDevice","gpu");
% act = getAction(actor, ...
%     {rand(obsInfo.Dimension)}); 
% act{1}
agentOpts = rlDDPGAgentOptions(...
    'SampleTime',Ts,...
    'TargetSmoothFactor',1e-3,...
    'DiscountFactor',1.0, ...
    'MiniBatchSize',64, ...
    'ExperienceBufferLength',1e6);
agentOpts.NoiseOptions.Variance = 0.3;
agentOpts.NoiseOptions.VarianceDecayRate = 1e-5;
agent = rlDDPGAgent(actor,critic,agentOpts);
maxepisodes = 500;
maxsteps = ceil(Tf/Ts);
trainOpts = rlTrainingOptions(...
    'MaxEpisodes',maxepisodes, ...
    'MaxStepsPerEpisode',maxsteps, ...
    'ScoreAveragingWindowLength',5, ...
    'Verbose',false, ...
    'Plots','training-progress',...
    'StopTrainingCriteria','AverageReward',...
    'StopTrainingValue',998);
trainingStats = train(agent,env,trainOpts);

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

请先登录，再进行评论。

请先登录，再回答此问题。

Answer 1

Divyanshu 2024-12-26

0
链接

此回答的直接链接

https://ww2.mathworks.cn/matlabcentral/answers/1963919-problems-with-control-of-pid-parameters-using-reinforcement-learning#answer_1556495

Hello @jh j,

If you are following some official example of Reinforcement Learning Toolbox, then ensure that the Reinforcement Learning Toolbox is installed properly on your system along with MATLAB. Because the error indicates that the model file 'BLDCRL' is missing.

However, if this is a custom code then ensure that the model 'BLDCRL' is present on MATLAB path.

Additionally, you can take reference of the following example documentation to get more details.

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

请先登录，再进行评论。

Problems with control of PID parameters using reinforcement learning

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

回答（1 个）

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

另请参阅

类别

标签

产品

版本

Community Treasure Hunt

Problems with control of PID parameters using reinforcement learning

0 个评论 显示 -2更早的评论隐藏 -2更早的评论

回答（1 个）

0 个评论 显示 -2更早的评论隐藏 -2更早的评论

另请参阅

类别

标签

产品

版本

Community Treasure Hunt

0 个评论
显示 -2更早的评论隐藏 -2更早的评论

0 个评论
显示 -2更早的评论隐藏 -2更早的评论