Problems encountered in DDPG

28 views (last 30 days)
邓龙京 on 2024-11-8, 12:32
Edited: Walter Roberson, 2024-11-18, 5:41
I set up a Simulink environment to use DDPG to suppress sub-oscillations. I followed the DDPG tutorial on MATLAB, but some errors occur when the program runs.
The error message is:
Error using rlQValueFunction
Number of input layers for state-action-value function deep neural network must equal the number of observation and action specifications.
Why is this happening? I tried setting my obsInfo and actInfo to [1 1] or 1, but all of these attempts failed with the same error message, so I am sure the cause of the error is not related to obsInfo = rlNumericSpec([3 1]); actInfo = rlNumericSpec([1 1]).
My code is:
mdl = 'rlVSG';
open_system(mdl);
% Define the observation and action spaces
obsInfo = rlNumericSpec([3 1]); % assume the observation space has dimension 3
actInfo = rlNumericSpec([1 1]); % DDPG handles continuous action spaces; assume the action space has dimension 1
% Set up the environment
env = rlSimulinkEnv(mdl, [mdl '/Subsystem2/RL Agent'], obsInfo, actInfo);
rng(0)
% Define the actor network
actorLayers = [
featureInputLayer(prod(obsInfo.Dimension))
fullyConnectedLayer(200)
reluLayer
fullyConnectedLayer(200)
reluLayer
fullyConnectedLayer(1)]; % outputs one continuous action value
actorNet = dlnetwork(actorLayers);
summary(actorNet)
% Create the actor object
actor = rlContinuousDeterministicActor( ...
actorNet, ...
obsInfo, ...
actInfo);
% Define the critic network
criticLayers = [
featureInputLayer(prod(actInfo.Dimension))
fullyConnectedLayer(200)
reluLayer
fullyConnectedLayer(200)
reluLayer
fullyConnectedLayer(1)]; % outputs a single Q-value
% Pass all layers directly when creating the dlnetwork
criticNet = dlnetwork(criticLayers);
summary(criticNet);
% Create the critic object
critic = rlQValueFunction(...
criticNet,...
obsInfo, ...
actInfo);
% Set optimizer options
actorOpts = rlOptimizerOptions('LearnRate', 1e-4, 'GradientThreshold', 0.3);
criticOpts = rlOptimizerOptions('LearnRate', 1e-3, 'GradientThreshold', 0.2);
% Set DDPG agent options
agentOpts = rlDDPGAgentOptions( ...
'SampleTime', 0.05, ...
'MiniBatchSize', 256, ...
'DiscountFactor', 0.99, ...
'ExperienceBufferLength', 1e6, ...
'ActorOptimizerOptions', actorOpts, ...
'CriticOptimizerOptions', criticOpts, ...
'UseTargetNetwork', true, ...
'TargetSmoothFactor', 1e-3, ...
'LearnRate', 1e-4);
% Create the DDPG agent
agent = rlDDPGAgent(actor, critic, agentOpts); % use the actor and critic objects rather than raw networks
% Set training options
trainOpts = rlTrainingOptions( ...
'MaxEpisodes', 1000, ...
'MaxStepsPerEpisode', 800, ...
'StopTrainingCriteria', 'AverageReward', ...
'StopTrainingValue', 2000, ...
'SaveAgentCriteria', 'AverageReward', ...
'SaveAgentValue', 2000);
% Train the agent
trainingStats = train(agent, env, trainOpts);
% Set simulation options
simOptions = rlSimulationOptions('MaxSteps', 1000);
% Simulate the agent
sim(env, agent, simOptions);
I checked the example on MATLAB and set up common layers. Here is the part of the code that I modified:
% Define the critic network
% Observation input layer and action input layer
obsInputLayer = featureInputLayer(prod(obsInfo.Dimension),Name="obsInput"); % observation input layer
actInputLayer = featureInputLayer(prod(actInfo.Dimension),Name="actInput"); % action input layer
% Use a concatenationLayer to merge the observation and action inputs
criticLayers = [concatenationLayer(1,2,Name="concat")
fullyConnectedLayer(200, 'Name', 'fc1')
reluLayer('Name', 'relu1')
fullyConnectedLayer(200, 'Name', 'fc2')
reluLayer('Name', 'relu2')
fullyConnectedLayer(1, 'Name', 'qValue')];
% Create the critic network
criticNet = dlnetwork;
criticNet = addLayers(criticNet, obsInputLayer);
criticNet = addLayers(criticNet, actInputLayer);
criticNet = addLayers(criticNet, criticLayers);
criticNet = connectLayers(criticNet,"obsInput","concat/in1");
criticNet = connectLayers(criticNet,"actInput","concat/in2");
summary(criticNet);
% Create the critic object
critic = rlQValueFunction(criticNet, obsInfo, actInfo);
The error message is:
Error using dlnetwork argument list is invalid. The function requires 1 additional input.
How can I solve this problem?

Answers (1)

MULI on 2024-11-18, 4:42
The error message “Number of input layers for state-action-value function deep neural network must equal the number of observation and action specifications” suggests that:
  • The critic network should have distinct input layers for observations and actions.
  • These input layers must be combined using a concatenation layer before they proceed through the rest of the network.
You can modify your critic network setup as given below:
% Define observation and action input layers
obsInputLayer = featureInputLayer(prod(obsInfo.Dimension), 'Name', 'obsInput');
actInputLayer = featureInputLayer(prod(actInfo.Dimension), 'Name', 'actInput');
% Define the layers for the critic network
criticLayers = [
concatenationLayer(1, 2, 'Name', 'concat')
fullyConnectedLayer(200, 'Name', 'fc1')
reluLayer('Name', 'relu1')
fullyConnectedLayer(200, 'Name', 'fc2')
reluLayer('Name', 'relu2')
fullyConnectedLayer(1, 'Name', 'qValue')];
% Create the critic network
criticNet = layerGraph();
criticNet = addLayers(criticNet, obsInputLayer);
criticNet = addLayers(criticNet, actInputLayer);
criticNet = addLayers(criticNet, criticLayers);
% Connect the input layers to the concatenation layer
criticNet = connectLayers(criticNet, 'obsInput', 'concat/in1');
criticNet = connectLayers(criticNet, 'actInput', 'concat/in2');
% Convert to dlnetwork
criticNet = dlnetwork(criticNet);
% Create Critic object
critic = rlQValueFunction(criticNet, obsInfo, actInfo);
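If the input layer names in your network ever differ from these, you can also state the mapping between network inputs and specifications explicitly. This is just a sketch; the name-value arguments below assume the input layer names "obsInput" and "actInput" defined above:
% Optional, equivalent construction that names the network inputs explicitly
critic = rlQValueFunction(criticNet, obsInfo, actInfo, ...
    ObservationInputNames="obsInput", ...
    ActionInputNames="actInput");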
The second error message, “Error using dlnetwork: argument list is invalid. The function requires 1 additional input.”, typically occurs when:
  • The layers are not correctly passed to the dlnetwork object (in your code, dlnetwork is called with no arguments at all).
  • The solution above addresses this by setting up the layer graph first and then converting it to a dlnetwork; a quick sanity check is shown after this list.
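As a quick sanity check (a sketch, not part of the original fix), you can confirm that the converted network exposes one input per specification and that the critic evaluates correctly before the agent is built:
% The converted dlnetwork should list exactly two inputs, one per spec
disp(criticNet.InputNames)   % expected: {'obsInput'}  {'actInput'}
% Evaluate the critic on a random observation-action pair; getValue should
% return a single scalar Q-value if the inputs are wired correctly
q = getValue(critic, {rand(obsInfo.Dimension)}, {rand(actInfo.Dimension)})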
Hope this helps!
