RL Agent Action Limits Not Working

Question

SEONGJAE SHIN 2023-10-13

Direct link to this question

https://ww2.mathworks.cn/matlabcentral/answers/2032784-rl-agent-action-limits-doesn-t-working

Edited: Emmanouil Tzorakoleftherakis 2023-11-2

I am trying to create an RL DDPG agent.

One action has a lower limit of 1 and an upper limit of 3, and the other has a lower limit of 5 and an upper limit of 25; I set these ranges with a scalingLayer. However, the action outputs go outside those ranges.

My code is

function agent = createAgent(observationInfo,actionInfo,Ts)
L = 64; % number of neurons
statePath = [
    featureInputLayer(observationInfo.Dimension(1),'Normalization','none','Name','observation')
    fullyConnectedLayer(L,'Name','fc1')
    reluLayer('Name','relu1')
    fullyConnectedLayer(L,'Name','fc2')
    additionLayer(2,'Name','add')
    reluLayer('Name','relu2')
    fullyConnectedLayer(L,'Name','fc3')
    reluLayer('Name','relu3')
    fullyConnectedLayer(1,'Name','fc4')];
actionPath = [
    featureInputLayer(actionInfo.Dimension(1),'Normalization','none','Name','action')
    fullyConnectedLayer(L, 'Name', 'fc5')];
criticNetwork = layerGraph(statePath);
criticNetwork = addLayers(criticNetwork, actionPath);
criticNetwork = connectLayers(criticNetwork,'fc5','add/in2');
criticNetwork = dlnetwork(criticNetwork);
 
criticOptions = rlOptimizerOptions('LearnRate',1e-3,'GradientThreshold',1);
critic = rlQValueFunction(criticNetwork,observationInfo,actionInfo,...
    'ObservationInputNames','observation','ActionInputNames','action');
scale = [1 10]';
bias = [2 15]';
actorNetwork = [
    featureInputLayer(observationInfo.Dimension(1),'Normalization','none','Name','observation')
    fullyConnectedLayer(L,'Name','fc1')
    reluLayer('Name','relu1')
    fullyConnectedLayer(L,'Name','fc2')
    reluLayer('Name','relu2')
    fullyConnectedLayer(L,'Name','fc3')
    reluLayer('Name','relu3')
    fullyConnectedLayer(2,'Name','fc4')
    tanhLayer('Name','tanh1')
    scalingLayer('Name','ActorScaling1','Scale', scale,'Bias', bias)];
actorNetwork = dlnetwork(actorNetwork);
actorOptions = rlOptimizerOptions('LearnRate',1e-3,'GradientThreshold', 1);
actor = rlContinuousDeterministicActor(actorNetwork,observationInfo,actionInfo);
agentOptions = rlDDPGAgentOptions(...
    'SampleTime',Ts,...
    'CriticOptimizerOptions',criticOptions,...
    'ActorOptimizerOptions',actorOptions,...
    'ExperienceBufferLength',1e6);
agentOptions.NoiseOptions.Variance = 0.6;
agentOptions.NoiseOptions.VarianceDecayRate = 1e-5;
agent = rlDDPGAgent(actor,critic,agentOptions);

Answer 1

Emmanouil Tzorakoleftherakis 2023-10-13

Direct link to this answer

https://ww2.mathworks.cn/matlabcentral/answers/2032784-rl-agent-action-limits-doesn-t-working#answer_1332704

Edited: Emmanouil Tzorakoleftherakis 2023-11-2

Please take a look at the DDPG algorithm, specifically step 1. DDPG promotes exploration by adding noise on top of the actor output, so depending on the noise characteristics, the action the agent applies may go out of bounds even though the tanh and scaling layers keep the actor output itself within range.
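
To illustrate the point, here is a minimal sketch in plain MATLAB (no toolbox calls). It assumes zero-mean Gaussian noise with variance 0.6 as a stand-in for the agent's actual noise model, and the `preActivation` values are hypothetical, so the numbers it prints only indicate the effect and are not what the toolbox would produce.

```matlab
% Simplified illustration: Gaussian noise stands in for the agent's noise model.
rng(0);                          % reproducible random numbers
scale = [1; 10];                 % scalingLayer Scale from the question
bias  = [2; 15];                 % scalingLayer Bias from the question
lowerLimit = bias - scale;       % [1; 5]
upperLimit = bias + scale;       % [3; 25]

N = 10000;                       % number of simulated steps
preActivation = 3*randn(2,N);    % hypothetical outputs of the last fully connected layer
actorOutput = scale .* tanh(preActivation) + bias;   % what tanh + scalingLayer compute

noiseVariance = 0.6;             % value set in the question
noisyAction = actorOutput + sqrt(noiseVariance)*randn(2,N);

% The actor output alone never leaves the band; the noisy action does.
actorInBounds = all(actorOutput >= lowerLimit & actorOutput <= upperLimit, 'all');
fractionOut = mean(noisyAction < lowerLimit | noisyAction > upperLimit, 2);

fprintf('Actor output always within limits: %d\n', actorInBounds);
fprintf('Fraction of noisy actions outside limits: [%.3f %.3f]\n', fractionOut);
```

The scaling layer bounds only the deterministic part of the action, which is why the violations show up once the exploration noise is added.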

First, make sure the noise variance you have makes sense for your application. For example, you have a variance of 0.6, which may be too much for the [1,3] range.
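
As a rough sense of scale, the short calculation below compares the noise magnitude with the width of each action range. It reads 0.6 as a plain variance, which is an assumption; how the toolbox actually applies the value depends on its noise model and the sample time, so treat the result as an order-of-magnitude check only.

```matlab
noiseVariance = 0.6;                       % value from the question
noiseStd = sqrt(noiseVariance);            % about 0.77 if 0.6 is read as a variance
actionRange = [3 - 1; 25 - 5];             % widths of the two action ranges: [2; 20]
relativeNoise = noiseStd ./ actionRange;   % noise scale as a fraction of each range

fprintf('Noise std is %.0f%% of range 1 and %.0f%% of range 2\n', 100*relativeNoise);
```

Under this reading, the same noise setting is large relative to the [1,3] action (roughly 39% of its width) but small relative to the [5,25] action (roughly 4%).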

The other thing you should consider is adding upper and lower limits to the action space definition. You have not shown that part of your code, but if you are running into violations, you are most likely not setting limits there.
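
A minimal sketch of what that could look like, assuming a two-element action vector to match the actor's final `fullyConnectedLayer(2)`. The action specification was not posted, so the dimensions and the name `'action'` here are assumptions. It requires Reinforcement Learning Toolbox.

```matlab
% Action specification with explicit limits (dimensions assumed to be 2-by-1).
actionInfo = rlNumericSpec([2 1], ...
    'LowerLimit',[1; 5], ...
    'UpperLimit',[3; 25]);
actionInfo.Name = 'action';      % hypothetical name

% Derive the scalingLayer parameters from the limits so the two stay consistent.
scale = (actionInfo.UpperLimit - actionInfo.LowerLimit)/2;   % half-width of each range
bias  = (actionInfo.UpperLimit + actionInfo.LowerLimit)/2;   % midpoint of each range
```

Computing `scale` and `bias` from the specification reproduces the `[1 10]'` and `[2 15]'` values used in the question, and avoids the limits and the scaling layer drifting apart if the ranges change later.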

Hope this helps
