DDPG agent low performance

1 view (last 30 days)
Armin Norouzi on 3 Jun 2021
Hello everyone,
I am trying to train a DDPG agent for my system, and the goal is to generate actions (mf) that track a desired torque. The attached figure shows the episodic reward versus the number of episodes, along with plots of the system (output, error, reward, and action). I set the output range from 5 to 30, but the agent is still oscillating around these values. Although the training performance in terms of episodic reward seems to converge, I am still experiencing an oscillatory response.
I would appreciate it if someone could help me with this matter.
Here is my reward block in Simulink:
It is worth mentioning that I am using standard parameters for the noise model:
% DDPG agent options
agentOpts = rlDDPGAgentOptions(...
    'SampleTime',Ts,...
    'TargetSmoothFactor',1e-3,...
    'DiscountFactor',0.99, ...
    'MiniBatchSize',64, ...
    'ExperienceBufferLength',1e6);
% Ornstein-Uhlenbeck exploration noise
agentOpts.NoiseOptions.Variance = 0.05*(25/sqrt(Ts));
agentOpts.NoiseOptions.VarianceDecayRate = 1e-5;
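For context on those noise settings: with the documented decay model, the variance shrinks geometrically each agent step, so a decay rate of 1e-5 keeps the exploration noise large for a very long time. A quick back-of-the-envelope check, assuming the per-step update Variance <- Variance*(1 - VarianceDecayRate):

% Rough estimate of how slowly the exploration noise decays,
% assuming the per-step update Variance <- Variance*(1 - VarianceDecayRate).
decayRate = 1e-5;
halfLife  = log(2)/decayRate;   % about 6.9e4 agent steps to halve the variance
fprintf('Noise variance halves after about %.0f steps\n', halfLife);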
Here are my actor and critic structures:
L = 500; % number of neurons per hidden layer
% Critic: observation (state) path; the action path joins at the addition layer
statePath = [
    featureInputLayer(numObservations, 'Normalization', 'none', 'Name', 'observation')
    fullyConnectedLayer(L, 'Name', 'fc1')
    reluLayer('Name', 'relu1')
    fullyConnectedLayer(L, 'Name', 'fc2')
    additionLayer(2,'Name','add')
    reluLayer('Name','relu2')
    fullyConnectedLayer(L, 'Name', 'fc3')
    reluLayer('Name','relu3')
    fullyConnectedLayer(1, 'Name', 'fc4')];
% Critic: action path
actionPath = [
    featureInputLayer(numActions, 'Normalization', 'none', 'Name', 'action')
    fullyConnectedLayer(L, 'Name', 'fc5')];
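The snippet above defines the two critic paths but not how they are joined into a critic representation. A minimal sketch of the missing glue, following the usual layerGraph/connectLayers pattern (the critic learning rate here is an assumption, not taken from the original post):

% Assemble the critic: 'fc5' (action path) feeds the second input
% of the addition layer defined in statePath.
criticNetwork = layerGraph(statePath);
criticNetwork = addLayers(criticNetwork, actionPath);
criticNetwork = connectLayers(criticNetwork, 'fc5', 'add/in2');
% Hypothetical options; LearnRate is assumed, not from the post.
criticOptions = rlRepresentationOptions('LearnRate',1e-3,'GradientThreshold',1);
critic = rlQValueRepresentation(criticNetwork, obsInfo, actInfo, ...
    'Observation', {'observation'}, 'Action', {'action'}, criticOptions);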
% Actor network: observation in, scaled continuous action out
actorNetwork = [
    featureInputLayer(numObservations, 'Normalization', 'none', 'Name', 'observation')
    fullyConnectedLayer(L, 'Name', 'fc1')
    reluLayer('Name', 'relu1')
    fullyConnectedLayer(L, 'Name', 'fc2')
    reluLayer('Name', 'relu2')
    fullyConnectedLayer(L, 'Name', 'fc3')
    reluLayer('Name', 'relu3')
    fullyConnectedLayer(numActions, 'Name', 'fc4')
    tanhLayer('Name','tanh1')
    scalingLayer('Name','ActorScaling1','Scale',max(actInfo.UpperLimit))];
actorOptions = rlRepresentationOptions('LearnRate',1e-4,'GradientThreshold',1,'L2RegularizationFactor',1e-4);
actor = rlDeterministicActorRepresentation(actorNetwork,obsInfo,actInfo,...
    'Observation',{'observation'},'Action',{'ActorScaling1'},actorOptions);
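The final step, building the agent itself, is not shown in the post; it would be a one-liner combining the representations above (a sketch, assuming the critic assembled earlier):

% Combine actor and critic into the DDPG agent using agentOpts from above.
agent = rlDDPGAgent(actor, critic, agentOpts);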

Answers (0)

Products

Release: R2020b
