rlValueFunction error: The number of network input layers must be equal to the number of observation channels in the environment specification object.
Hi,
I am currently training the biped robot with the PPO algorithm. I used rlValueFunction to create the critic, but it keeps showing this error:
Error using rl.internal.validate.mapFunctionObservationInput (line 5)
The number of network input layers must be equal to the number of observation channels in the
environment specification object.
Error in rlValueFunction (line 81)
model = rl.internal.validate.mapFunctionObservationInput(model,observationInfo,...
Error in createPPONetworks (line 219)
critic = rlValueFunction(criticNetwork,env.getObservationInfo)
The critic network code and a plot of the network are below:
criticLayerSizes = [600,400]; %400,300
statePath = [
imageInputLayer([31 1 1],'Normalization','none','Name', 'observation')
fullyConnectedLayer(criticLayerSizes(1), 'Name', 'CriticStateFC1', ...
'Weights',2/sqrt(31)*(rand(criticLayerSizes(1),31)-0.5), ...
'Bias',2/sqrt(31)*(rand(criticLayerSizes(1),1)-0.5))
reluLayer('Name','CriticStateRelu1')
fullyConnectedLayer(criticLayerSizes(2), 'Name', 'CriticStateFC2', ...
'Weights',2/sqrt(criticLayerSizes(1))*(rand(criticLayerSizes(2),criticLayerSizes(1))-0.5), ...
'Bias',2/sqrt(criticLayerSizes(1))*(rand(criticLayerSizes(2),1)-0.5))
];
actionPath = [
imageInputLayer([6 1 1],'Normalization','none', 'Name', 'action') %numAct
fullyConnectedLayer(criticLayerSizes(2), 'Name', 'CriticActionFC1', ...
'Weights',2/sqrt(6)*(rand(criticLayerSizes(2),6)-0.5), ...
'Bias',2/sqrt(6)*(rand(criticLayerSizes(2),1)-0.5))
];
commonPath = [
additionLayer(2,'Name','add')
reluLayer('Name','CriticCommonRelu1')
fullyConnectedLayer(1, 'Name', 'CriticOutput',...
'Weights',2*5e-3*(rand(1,criticLayerSizes(2))-0.5), ...
'Bias',2*5e-3*(rand(1,1)-0.5))
];
% Connect the layer graph
criticNetwork = layerGraph(statePath);
criticNetwork = addLayers(criticNetwork, actionPath);
criticNetwork = addLayers(criticNetwork, commonPath);
criticNetwork = connectLayers(criticNetwork,'CriticStateFC2','add/in1');
criticNetwork = connectLayers(criticNetwork,'CriticActionFC1','add/in2');
plot(criticNetwork)
% Create critic representation
criticOptions = rlRepresentationOptions('Optimizer','adam','LearnRate',1e-3, ...
'GradientThreshold',1,'L2RegularizationFactor',2e-4);
if useGPU
%criticOptions.UseDevice = 'gpu';
end
critic = rlValueFunction(criticNetwork,env.getObservationInfo)
I couldn't figure out why this error occurs. Can anyone please offer me some guidance on this problem? Thank you very much.
Accepted Answer
Ayush Aniket
2024-6-19
Hi Anna,
The error message indicates a mismatch between the number of input layers in your critic network and the number of observation channels provided by your environment's observation specification. Referring to the Biped Robot example here: https://www.mathworks.com/help/reinforcement-learning/ug/train-biped-robot-to-walk-using-reinforcement-learning-agents.html#TrainBipedRobotToWalkUsingReinforcementLearningAgentsExample-2 ,
the dimension of the environment's ObservationInfo is [29 1], while the input layer of your critic network expects an image of dimension [31 1 1]. imageInputLayer is intended for image data. Since your observations and actions are not images but plain numeric vectors, consider using featureInputLayer instead for a more appropriate representation.
You can adjust the observation input layer as shown below:
statePath = [
featureInputLayer(obsInfo.Dimension(1),'Normalization','none','Name', 'observation')
% ... rest of your layers
];
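As a quick sanity check, you can inspect the observation specification before constructing the value function and confirm it matches your network's input layers (a minimal sketch, assuming env has already been created):
% Inspect the environment's observation specification (assumes env exists)
obsInfo = env.getObservationInfo;
numel(obsInfo)        % number of observation channels; the network needs one input layer per channel
obsInfo(1).Dimension  % dimension of the first channel, e.g. [29 1] in the Biped Robot example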
For the actor, you should use the rlDiscreteCategoricalActor function, as the action space of the environment is discrete. You can read more about the function here: https://www.mathworks.com/help/reinforcement-learning/ref/rl.function.rldiscretecategoricalactor.html
Refer to the following link for the process of creating the critic with rlValueFunction: https://www.mathworks.com/help/reinforcement-learning/ref/rl.function.rlvaluefunction.html#mw_810cde6a-16f5-47da-b472-acabe4917d18
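For reference, here is a minimal sketch of how such an actor could be built, assuming the action specification returned by env.getActionInfo is a discrete rlFiniteSetSpec as described above (layer sizes are illustrative, not tuned):
% Sketch of a discrete categorical actor (assumes obsInfo and env exist)
actInfo = env.getActionInfo;
numObs  = obsInfo.Dimension(1);
numAct  = numel(actInfo.Elements);   % number of discrete actions
actorNetwork = [
    featureInputLayer(numObs,'Normalization','none','Name','observation')
    fullyConnectedLayer(300,'Name','ActorFC1')
    reluLayer('Name','ActorRelu1')
    fullyConnectedLayer(numAct,'Name','ActorOutput')   % one output element per discrete action
    ];
actor = rlDiscreteCategoricalActor(actorNetwork,obsInfo,actInfo);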
3 Comments
Ayush Aniket
2024-6-20
Hi Anna,
When creating a critic network for a PPO agent, we don't need the action information. Your criticNetwork has an additional channel for action inputs, which is not required. Refer to the following documentation link to read about the requirements of the rlValueFunction function: https://www.mathworks.com/help/reinforcement-learning/ug/create-policy-and-value-functions.html#mw_90f2c37b-252f-4ce1-902d-1ed95616f1ee
Your code can be modified as follows to create a critic network:
%% CRITIC
% Create the critic network layers
criticLayerSizes = [600,400]; %400,300
statePath = [
featureInputLayer(obsInfo.Dimension(1),'Normalization','none','Name', 'observation')
fullyConnectedLayer(criticLayerSizes(1), 'Name', 'CriticStateFC1', ...
'Weights',2/sqrt(obsInfo.Dimension(1))*(rand(criticLayerSizes(1),obsInfo.Dimension(1))-0.5), ...
'Bias',2/sqrt(obsInfo.Dimension(1))*(rand(criticLayerSizes(1),1)-0.5))
reluLayer('Name','CriticStateRelu1')
fullyConnectedLayer(criticLayerSizes(2), 'Name', 'CriticStateFC2', ...
'Weights',2/sqrt(criticLayerSizes(1))*(rand(criticLayerSizes(2),criticLayerSizes(1))-0.5), ...
'Bias',2/sqrt(criticLayerSizes(1))*(rand(criticLayerSizes(2),1)-0.5))
reluLayer('Name','CriticCommonRelu1')
fullyConnectedLayer(1, 'Name', 'CriticOutput',...
'Weights',2*5e-3*(rand(1,criticLayerSizes(2))-0.5), ...
'Bias',2*5e-3*(rand(1,1)-0.5))
];
% Connect the layer graph
criticNetwork = layerGraph(statePath);
figure(2)
plot(criticNetwork)
hold on
% Create critic representation
criticOptions = rlRepresentationOptions('Optimizer','adam','LearnRate',1e-3, ...
'GradientThreshold',1,'L2RegularizationFactor',2e-4);
critic = rlValueFunction(criticNetwork,env.getObservationInfo)
In the above code, I have removed the actionPath layers and combined the statePath and commonPath layers. The actionPath layers can be used to create the actor required for the PPO agent with the rlDiscreteCategoricalActor function. Refer to the following documentation to read about PPO agents and the process of creating the actor and critic networks: https://www.mathworks.com/help/reinforcement-learning/ug/proximal-policy-optimization-agents.html#mw_06cbb093-e547-44e0-9c06-61d829a6c113
Additionally, MATLAB also provides the rlPPOAgent function. You can read about it here: https://www.mathworks.com/help/reinforcement-learning/ref/rl.agent.rlppoagent.html
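Once the actor and critic are available, assembling the PPO agent could look like the following sketch (the option values are illustrative defaults, not tuned for this environment):
% Sketch of creating the PPO agent from the actor and critic
agentOptions = rlPPOAgentOptions( ...
    'ExperienceHorizon',512, ...
    'ClipFactor',0.2, ...
    'EntropyLossWeight',0.01, ...
    'MiniBatchSize',128, ...
    'DiscountFactor',0.99);
agent = rlPPOAgent(actor,critic,agentOptions);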