Reinforcement Learning - Multiple Discrete Actions

I would like to use a DQN agent where I have multiple continuous states (or observations) and two action signals, each with three possible values, for a total of 9 combinations. For example, see the following lines to understand what I mean:
a = [-2,0,2];
b = [-3,0,3];
[A,B] = meshgrid(a,b);
actions = reshape(cat(2,A',B'),[],2);
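For reference, assuming the snippet above runs as written, actions comes out as a 9-by-2 matrix with one row per joint action:
actions =
    -2    -3
     0    -3
     2    -3
    -2     0
     0     0
     2     0
    -2     3
     0     3
     2     3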
If I want to create discrete actions, I need to convert the matrix into a cell array and run:
actionInfo = rlFiniteSetSpec(num2cell(actions,2));
actionInfo.Name = 'actions';
Additionally, a DQN agent has a critic, which is a deep neural network. I have created the critic as follows:
% Create a DNN for the critic:
hiddenLayerSize = 48;
observationPath = [
    imageInputLayer([numObs 1 1],'Normalization','none','Name','observation')
    fullyConnectedLayer(hiddenLayerSize,'Name','CriticStateFC1')
    reluLayer('Name','CriticReLu1')
    fullyConnectedLayer(hiddenLayerSize,'Name','CriticStateFC2')
    additionLayer(2,'Name','add')
    reluLayer('Name','CriticCommonReLu1')
    fullyConnectedLayer(hiddenLayerSize,'Name','CriticCommonFC1')
    reluLayer('Name','CriticCommonReLu2')
    fullyConnectedLayer(1,'Name','CriticOutput')];
actionPath = [
    imageInputLayer([value 1 1],'Normalization','none','Name','action')
    fullyConnectedLayer(hiddenLayerSize,'Name','CriticActionFC1')];
% Create the layerGraph:
criticNetwork = layerGraph(observationPath);
criticNetwork = addLayers(criticNetwork,actionPath);
% Connect actionPath to observationPath:
criticNetwork = connectLayers(criticNetwork,'CriticActionFC1','add/in2');
% Specify options for the critic representation:
criticOpts = rlRepresentationOptions('LearnRate',1e-03,...
    'GradientThreshold',1,'UseDevice','gpu');
% Create the critic representation using the specified DNN and options:
critic = rlRepresentation(criticNetwork,observationInfo,actionInfo,...
    'Observation',{'observation'},'Action',{'action'},criticOpts);
% set the desired options for the agent:
agentOptions = rlDQNAgentOptions(...
    'SampleTime',dt,...
    'UseDoubleDQN',true,...
    'TargetSmoothFactor',1e-3,...
    'DiscountFactor',0.99,...
    'ExperienceBufferLength',1e7,...
    'MiniBatchSize',128);
My problem is the image input layer at the start of the action path, imageInputLayer([value 1 1],'Normalization','none','Name','action'). I have tried setting value to 1, 2, 9 and 18, but they all result in an error when I run
agent = rlDQNAgent(critic,agentOptions);
This is because actionInfo holds a cell array of 9 elements, each a double vector of size [1,2], whereas the imageInputLayer expects dimensions [value,1,1].
So, how can I set up a DQN agent in MATLAB with two main discrete action signals, each with three possible values?
Many thanks in advance for the help!
2 Comments
Clemens Fricke 2019-7-11
Hey,
I am not sure if I should open a new thread for this, but since it is very close to this question I will try asking here first.
I am trying to use the PG agent with multiple discrete actions, and I have no idea what the last layer of the actor network should look like.
I have [2,62] actions (2 parameters, each with 62 discrete states), and the output layer only accepts a positive integer, not a vector. I have tried 2 for the number of parameters and 124 for the total number of possible actions. Both give me the same error:
Error using categorical (line 337)
Could not find unique values in VALUESET using the UNIQUE function.
Error in rl.util.rlLayerRepresentation/buildNetwork (line 719)
categorical(ActionValues, ActionValues);
Error in rl.util.rlLayerRepresentation/setLoss (line 175)
this = buildNetwork(this);
Error in rl.agent.rlPGAgent/setActorRepresentation (line 339)
actor = setLoss(actor,'cte','EntropyLossWeight',opt.EntropyLossWeight);
Error in rl.agent.rlPGAgent (line 47)
this = setActorRepresentation(this,actor,opt);
Error in rlPGAgent (line 21)
Agent = rl.agent.rlPGAgent(varargin{:});
Error in DQN (line 67)
agent = rlPGAgent(actor,baseline,agentOpts);
Caused by:
Error using cell/unique (line 85)
Cell array input must be a cell array of character vectors.
I have attached the file to this comment.
Enrico Anderlini 2019-8-30
Sorry, but I have just seen this.
Do you have 62 states and 2 actions? Or 2 states, 62 actions? Or 124 actions?
I would not recommend a large number of actions, as it will cause learning problems.
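For reference, if it really is 2 discrete action signals with 62 values each, the same enumeration trick from the question above applies, but it produces 62^2 = 3844 joint actions, which is what makes learning difficult (the value vectors below are only illustrative):
a = linspace(-1,1,62);                    % illustrative values for signal 1
b = linspace(-1,1,62);                    % illustrative values for signal 2
[A,B] = meshgrid(a,b);
actions = reshape(cat(2,A',B'),[],2);     % 3844-by-2, one row per joint action
actionInfo = rlFiniteSetSpec(num2cell(actions,2));
actionInfo.Name = 'actions';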


Accepted Answer

Emmanouil Tzorakoleftherakis
Hi Enrico,
Try
actionPath = [
    imageInputLayer([1 2 1],'Normalization','none','Name','action')
    fullyConnectedLayer(hiddenLayerSize,'Name','CriticActionFC1')];
Each action in your code is 1-by-2, which should be reflected in the dimensions of the action path input.
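For completeness, a sketch of how the corrected action path slots back into the rest of your code (reusing your own variable names; observationPath, criticOpts and agentOptions are as you defined them):
criticNetwork = layerGraph(observationPath);
criticNetwork = addLayers(criticNetwork,actionPath);
criticNetwork = connectLayers(criticNetwork,'CriticActionFC1','add/in2');
critic = rlRepresentation(criticNetwork,observationInfo,actionInfo,...
    'Observation',{'observation'},'Action',{'action'},criticOpts);
agent = rlDQNAgent(critic,agentOptions);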
I hope this helps.
1 Comment
Enrico Anderlini 2019-6-12
Hi Emmanouil,
many thanks for the help!
It works great. I would add that you then need a Reshape block in Simulink, but that is no problem. It is much faster than a mapping through an additional C-coded S-function.


More Answers (1)

kqha1025 2022-6-23
Hi,
Do you have a function for building "actions" in the general case with more action signals, e.g. 3 action signals, each with its own array of values? (This case has two action arrays.)
Thank you very much.
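One possible sketch, generalizing the meshgrid snippet from the original question to any number of discrete action signals (the signal values below are only illustrative):
% Each cell of 'signals' holds the possible values of one action signal.
signals = {[-2 0 2],[-3 0 3],[-1 0 1]};    % e.g. three signals with three values each
grids = cell(1,numel(signals));
[grids{:}] = ndgrid(signals{:});           % one N-D grid per signal
cols = cellfun(@(g) g(:),grids,'UniformOutput',false);
actions = [cols{:}];                       % one row per joint action (27-by-3 here)
actionInfo = rlFiniteSetSpec(num2cell(actions,2));
actionInfo.Name = 'actions';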

Release

R2019a
