Reinforcement Learning - Multiple Discrete Actions

I would like to use a DQN agent where I have multiple continuous states (or observations) and two action signals, each with three possible values, for a total of 9 combinations. For example, see the following lines to understand what I mean:
a = [-2,0,2];
b = [-3,0,3];
[A,B] = meshgrid(a,b);
actions = reshape(cat(2,A',B'),[],2);
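For reference, assuming the snippet above runs as written, actions comes out as a 9-by-2 matrix with one row per joint action:
actions =
    -2    -3
     0    -3
     2    -3
    -2     0
     0     0
     2     0
    -2     3
     0     3
     2     3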
If I want to create discrete actions, I need to convert the matrix into a cell array and run:
actionInfo = rlFiniteSetSpec(num2cell(actions,2));
actionInfo.Name = 'actions';
Additionally, a DQN agent has a critic, which is a deep neural network. I have created the critic as follows:
% Create a DNN for the critic:
hiddenLayerSize = 48;
observationPath = [
    imageInputLayer([numObs 1 1],'Normalization','none','Name','observation')
    fullyConnectedLayer(hiddenLayerSize,'Name','CriticStateFC1')
    reluLayer('Name','CriticReLu1')
    fullyConnectedLayer(hiddenLayerSize,'Name','CriticStateFC2')
    additionLayer(2,'Name','add')
    reluLayer('Name','CriticCommonReLu1')
    fullyConnectedLayer(hiddenLayerSize,'Name','CriticCommonFC1')
    reluLayer('Name','CriticCommonReLu2')
    fullyConnectedLayer(1,'Name','CriticOutput')];
actionPath = [
    imageInputLayer([value 1 1],'Normalization','none','Name','action')
    fullyConnectedLayer(hiddenLayerSize,'Name','CriticActionFC1')];
% Create the layerGraph:
criticNetwork = layerGraph(observationPath);
criticNetwork = addLayers(criticNetwork,actionPath);
% Connect actionPath to observationPath:
criticNetwork = connectLayers(criticNetwork,'CriticActionFC1','add/in2');
% Specify options for the critic representation:
criticOpts = rlRepresentationOptions('LearnRate',1e-03,...
    'GradientThreshold',1,'UseDevice','gpu');
% Create the critic representation using the specified DNN and options:
critic = rlRepresentation(criticNetwork,observationInfo,actionInfo,...
    'Observation',{'observation'},'Action',{'action'},criticOpts);
% set the desired options for the agent:
agentOptions = rlDQNAgentOptions(...
    'SampleTime',dt,...
    'UseDoubleDQN',true,...
    'TargetSmoothFactor',1e-3,...
    'DiscountFactor',0.99,...
    'ExperienceBufferLength',1e7,...
    'MiniBatchSize',128);
My problem is the image input layer at the start of the action path, imageInputLayer([value 1 1],'Normalization','none','Name','action'). I have tried setting value to 1, 2, 9 and 18, but they all result in an error when I run
agent = rlDQNAgent(critic,agentOptions);
This is because actionInfo holds a cell array of 9 elements, each a double vector of size [1,2], whereas the imageInputLayer expects dimensions [value,1,1].
So, how can I set up a DQN agent in MATLAB with two main discrete action signals, each with three possible values?
Many thanks in advance for the help!
2 Comments
Clemens Fricke 2019-7-11
Hey,
I am not sure if I should open a new thread for this, but since it is very close to this question I will try asking here first.
I am trying to use the PG agent with multiple discrete actions, and I have no idea what the last layer of the actor network should look like.
I have [2,62] actions (2 parameters, each with 62 discrete states), and the output layer only accepts a positive integer, not a vector. I have tried 2 for the number of parameters and 124 for the total number of possible actions. Both give me the same error:
Error using categorical (line 337)
Could not find unique values in VALUESET using the UNIQUE function.
Error in rl.util.rlLayerRepresentation/buildNetwork (line 719)
categorical(ActionValues, ActionValues);
Error in rl.util.rlLayerRepresentation/setLoss (line 175)
this = buildNetwork(this);
Error in rl.agent.rlPGAgent/setActorRepresentation (line 339)
actor = setLoss(actor,'cte','EntropyLossWeight',opt.EntropyLossWeight);
Error in rl.agent.rlPGAgent (line 47)
this = setActorRepresentation(this,actor,opt);
Error in rlPGAgent (line 21)
Agent = rl.agent.rlPGAgent(varargin{:});
Error in DQN (line 67)
agent = rlPGAgent(actor,baseline,agentOpts);
Caused by:
Error using cell/unique (line 85)
Cell array input must be a cell array of character vectors.
I have attached the file to this comment.
Enrico Anderlini 2019-8-30
Sorry, but I have just seen this.
Do you have 62 states and 2 actions? Or 2 states, 62 actions? Or 124 actions?
I would not recommend a large number of actions, as it will cause learning problems.
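For reference, if it really is 2 discrete action signals with 62 values each, the same enumeration trick from the question above applies, but it produces 62^2 = 3844 joint actions, which is what makes learning difficult (the value vectors below are only illustrative):
a = linspace(-1,1,62);                    % illustrative values for signal 1
b = linspace(-1,1,62);                    % illustrative values for signal 2
[A,B] = meshgrid(a,b);
actions = reshape(cat(2,A',B'),[],2);     % 3844-by-2, one row per joint action
actionInfo = rlFiniteSetSpec(num2cell(actions,2));
actionInfo.Name = 'actions';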


Accepted Answer

Emmanouil Tzorakoleftherakis
Hi Enrico,
Try
actionPath = [
    imageInputLayer([1 2 1],'Normalization','none','Name','action')
    fullyConnectedLayer(hiddenLayerSize,'Name','CriticActionFC1')];
Each action in your code is 1-by-2, which should be reflected in the dimensions of the action path input.
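For completeness, a sketch of how the corrected action path slots back into the rest of your code (reusing your own variable names; observationPath, criticOpts and agentOptions are as you defined them):
criticNetwork = layerGraph(observationPath);
criticNetwork = addLayers(criticNetwork,actionPath);
criticNetwork = connectLayers(criticNetwork,'CriticActionFC1','add/in2');
critic = rlRepresentation(criticNetwork,observationInfo,actionInfo,...
    'Observation',{'observation'},'Action',{'action'},criticOpts);
agent = rlDQNAgent(critic,agentOptions);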
I hope this helps.
1 Comment
Enrico Anderlini 2019-6-12
Hi Emmanouil,
many thanks for the help!
It works great. I would add that you then need a Reshape block in Simulink, but that is no problem. It is much faster than a mapping through an additional C-coded S-function.


More Answers (1)

kqha1025 2022-6-23
Hi,
Do you have a function for building "actions" in the general case with more action signals, e.g. 3 action signals, each with its own array of values? (This case has two action arrays.)
Thank you very much.
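One possible sketch, generalizing the meshgrid snippet from the original question to any number of discrete action signals (the signal values below are only illustrative):
% Each cell of 'signals' holds the possible values of one action signal.
signals = {[-2 0 2],[-3 0 3],[-1 0 1]};    % e.g. three signals with three values each
grids = cell(1,numel(signals));
[grids{:}] = ndgrid(signals{:});           % one N-D grid per signal
cols = cellfun(@(g) g(:),grids,'UniformOutput',false);
actions = [cols{:}];                       % one row per joint action (27-by-3 here)
actionInfo = rlFiniteSetSpec(num2cell(actions,2));
actionInfo.Name = 'actions';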

Release

R2019a
