Mixed Type Observation Variables in RL

Hi.
I want to design a DQN agent to train in an environment whose observation consists of 5 continuous double variables, one discrete variable with values [0 1], and two discrete variables with values [-1 0 1]. I define the observation info as:
ObsInfo = [
    rlNumericSpec([1 5], 'Name', 'X15'), ...      % 5 continuous observation variables
    rlFiniteSetSpec([0 1], 'Name', 'X6'), ...     % discrete observation variable with values [0 1]
    rlFiniteSetSpec([-1 0 1], 'Name', 'X7'), ...  % 1st discrete observation variable with values [-1 0 1]
    rlFiniteSetSpec([-1 0 1], 'Name', 'X8')       % 2nd discrete observation variable with values [-1 0 1]
];
ActionInfo = rlFiniteSetSpec([-2, -1, 0, 1, 2]);
Therefore, the reset and step functions return the observation in the form:
Obs = {[X1; X2; X3; X4; X5], X6, X7, X8}
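For reference, a minimal sketch of my reset function (myResetFunction is just an illustrative name, and the state values below are placeholders computed by the real environment):
function [InitialObservation, LoggedSignals] = myResetFunction()
    X15 = zeros(1, 5);                        % 5 continuous variables, 1x5 to match rlNumericSpec([1 5])
    X6 = 0;                                   % discrete variable from {0, 1}
    X7 = 0;                                   % discrete variable from {-1, 0, 1}
    X8 = 0;                                   % discrete variable from {-1, 0, 1}
    InitialObservation = {X15, X6, X7, X8};   % one cell entry per observation channel
    LoggedSignals.State = InitialObservation;
end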
Then I define a deep neural network as follows:
layers = [
    featureInputLayer(8, 'Normalization', 'none', 'Name', 'state')   % 8 observation variables
    fullyConnectedLayer(100, 'Name', 'fc1')
    reluLayer('Name', 'relu1')
    fullyConnectedLayer(100, 'Name', 'fc2')
    reluLayer('Name', 'relu2')
    fullyConnectedLayer(5, 'Name', 'fc3')                            % number of actions
];
dnn = dlnetwork(layers);
critic = rlVectorQValueFunction(dnn, ObsInfo, ActionInfo);
However, this code leads to the following error:
The number of network input layers must be equal to the number of observation channels in the environment specification object.
Could you please help me fix this issue? Is the definition of ObsInfo correct for this type of problem? And is the network architecture OK?
Thank you.

Answers (1)

Shantanu Dixit on 10 Sep 2024
Hi Mahmood,
The issue is a mismatch between the number of observation channels in the environment specification and the number of input layers in the network. One way to incorporate both continuous and discrete observations is to use a single continuous observation channel; the discrete observations still take values from a finite set as dictated by the environment. This approach requires changing the environment to output all observation values as one continuous vector:
ObsInfo = rlNumericSpec([1, 8], 'Name', 'Observations');
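With this spec, the original single-input network from the question can be used unchanged; the reset and step functions then return one 8-element vector instead of a cell array. A minimal sketch, reusing ObsInfo from the line above:
ActionInfo = rlFiniteSetSpec([-2, -1, 0, 1, 2]);
layers = [
    featureInputLayer(8, 'Normalization', 'none', 'Name', 'state')   % all 8 observation variables
    fullyConnectedLayer(100, 'Name', 'fc1')
    reluLayer('Name', 'relu1')
    fullyConnectedLayer(100, 'Name', 'fc2')
    reluLayer('Name', 'relu2')
    fullyConnectedLayer(5, 'Name', 'fc3')                            % one output per action
];
dnn = dlnetwork(layers);
critic = rlVectorQValueFunction(dnn, ObsInfo, ActionInfo);
% Inside reset/step, flatten the mixed observation, e.g.:
% Obs = [X1, X2, X3, X4, X5, X6, X7, X8];   % 1x8 row vector matching the spec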
Alternatively, if the observations are to be provided as separate channels, as in the code above, the network must be modified to handle multiple input channels. The following steps describe this briefly:
  1. Separate input layers for each observation channel, followed by fully connected layers for feature extraction
  2. Concatenating the outputs from the separate channels
  3. Passing the concatenated features to the base network for further processing
Below is reference code for the above, using the same base network as before:
%% 1. Separate input layers for each channel
continuousInput = featureInputLayer(5, 'Normalization', 'none', 'Name', 'continuousInput');
binaryInput = featureInputLayer(1, 'Normalization', 'none', 'Name', 'binaryInput');
ternaryInput1 = featureInputLayer(1, 'Normalization', 'none', 'Name', 'ternaryInput1');
ternaryInput2 = featureInputLayer(1, 'Normalization', 'none', 'Name', 'ternaryInput2');
continuousPath = [
    continuousInput
    fullyConnectedLayer(10, 'Name', 'fc_continuous')
    reluLayer('Name', 'relu_continuous')
];
binaryPath = [
    binaryInput
    fullyConnectedLayer(5, 'Name', 'fc_binary')
    reluLayer('Name', 'relu_binary')
];
ternaryPath1 = [
    ternaryInput1
    fullyConnectedLayer(5, 'Name', 'fc_ternary1')
    reluLayer('Name', 'relu_ternary1')
];
ternaryPath2 = [
    ternaryInput2
    fullyConnectedLayer(5, 'Name', 'fc_ternary2')
    reluLayer('Name', 'relu_ternary2')
];
%% 2. Concatenating outputs from all the channels
concatLayer = concatenationLayer(1, 4, 'Name', 'concat');
% Further processing after concatenation
commonPath = [
    fullyConnectedLayer(100, 'Name', 'fc1')
    reluLayer('Name', 'relu1')
    fullyConnectedLayer(100, 'Name', 'fc2')
    reluLayer('Name', 'relu2')
    fullyConnectedLayer(5, 'Name', 'fc3')   % one output per action
];
% Assemble the network as a layer graph
lgraph = layerGraph();
lgraph = addLayers(lgraph, continuousPath);
lgraph = addLayers(lgraph, binaryPath);
lgraph = addLayers(lgraph, ternaryPath1);
lgraph = addLayers(lgraph, ternaryPath2);
lgraph = addLayers(lgraph, concatLayer);
lgraph = addLayers(lgraph, commonPath);
% Connect each channel's output to the concatenation layer
lgraph = connectLayers(lgraph, 'relu_continuous', 'concat/in1');
lgraph = connectLayers(lgraph, 'relu_binary', 'concat/in2');
lgraph = connectLayers(lgraph, 'relu_ternary1', 'concat/in3');
lgraph = connectLayers(lgraph, 'relu_ternary2', 'concat/in4');
lgraph = connectLayers(lgraph, 'concat', 'fc1');
%% 3. Pass the assembled graph to the base network and create the critic
dnn = dlnetwork(lgraph);
ObsInfoContinuous = rlNumericSpec([1 5], 'Name', 'ContinuousObs');
ObsInfoBinary = rlFiniteSetSpec([0 1], 'Name', 'BinaryObs');
ObsInfoTernary1 = rlFiniteSetSpec([-1 0 1], 'Name', 'TernaryObs1');
ObsInfoTernary2 = rlFiniteSetSpec([-1 0 1], 'Name', 'TernaryObs2');
ActionInfo = rlFiniteSetSpec([-2, -1, 0, 1, 2]);
critic = rlVectorQValueFunction(dnn, ...
    [ObsInfoContinuous, ObsInfoBinary, ObsInfoTernary1, ObsInfoTernary2], ...
    ActionInfo, ...
    'ObservationInputNames', {'continuousInput', 'binaryInput', 'ternaryInput1', 'ternaryInput2'});
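Once the critic is created without error, a quick sanity check and agent construction could look like this (a sketch assuming default DQN agent options):
% Query Q-values for one sample observation (one cell entry per channel)
sampleObs = {rand(1, 5), 0, 1, -1};
qValues = getValue(critic, sampleObs)   % 5x1 vector, one Q-value per action
% Create the DQN agent from the vector Q-value critic
agent = rlDQNAgent(critic);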
For more information on creating observation specifications with multiple channels, refer to the MathWorks documentation for rlNumericSpec and rlFiniteSetSpec.
