Constraints on the actor action outputs in DDPG RL LQR type control

Hello Everyone,
I am trying to train an agent for LQR-type control. My observation is a 59x1 vector of states and my control input is a 6x1 vector. The control inputs are voltage and power setpoints, which need to be constrained. My inputs are [voltage1, power1, voltage2, power2, voltage3, power3]. The voltages need to be constrained between 0.95 and 1.05, and each power should be positive, with a separate Pmax for each one. I am a bit confused about how to enforce these constraints in the actor neural network. Any help will be appreciated. My sample code is as follows:
%% Critic neural network
obsPath = featureInputLayer(obsInfo.Dimension(1),Name="obsIn");
actPath = featureInputLayer(actInfo.Dimension(1),Name="actIn");
commonPath = [
    concatenationLayer(1,2,Name="concat")
    quadraticLayer
    fullyConnectedLayer(1,Name="value", ...
        BiasLearnRateFactor=0,Bias=0)
    ];
% Add layers to layerGraph object
criticNet = layerGraph(obsPath);
criticNet = addLayers(criticNet,actPath);
criticNet = addLayers(criticNet,commonPath);
% Connect layers
criticNet = connectLayers(criticNet,"obsIn","concat/in1");
criticNet = connectLayers(criticNet,"actIn","concat/in2");
criticNet = dlnetwork(criticNet);
critic = rlQValueFunction(criticNet, ...
    obsInfo,actInfo, ...
    ObservationInputNames="obsIn",ActionInputNames="actIn");
getValue(critic,{rand(obsInfo.Dimension)},{rand(actInfo.Dimension)})
%% Actor neural network
Biass = zeros(6,1); % no biasing linear actor
actorNet = [
    featureInputLayer(obsInfo.Dimension(1))
    fullyConnectedLayer(actInfo.Dimension(1), ...
        BiasLearnRateFactor=0,Bias=Biass)
    ];
actorNet = dlnetwork(actorNet);
actor = rlContinuousDeterministicActor(actorNet,obsInfo,actInfo);
agent = rlDDPGAgent(actor,critic);
getAction(agent,{rand(obsInfo.Dimension)}) %% getting error while executing this line of command
%%

Accepted Answer

Shivansh 2023-10-25
Hi Muhammad,
I understand that you are trying to constrain the voltage and power values produced by the actor neural network. Here is an example approach that you can adapt to your environment and problem requirements.
Since the voltage and power constraints are different, it is better to give each group its own output layer in the actor network. That way, you can apply a different constraint to each group of outputs.
To enforce the voltage constraint between 0.95 and 1.05, you can add a custom activation layer after the voltage output layer. This layer clamps the output values to the desired range. Here is an example of how you can add the voltage constraint:
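% Three voltages and three powers, each group with its own output layer and clamp.
% Pmax is assumed to be a 3x1 vector holding the power limit of each unit.
obsIn = featureInputLayer(obsInfo.Dimension(1),Name="obsIn");
voltagePath = [
    fullyConnectedLayer(3,Name="voltageFC",BiasLearnRateFactor=0,Bias=zeros(3,1))
    customClampLayer(0.95,1.05,"VoltageClamp")
    ];
powerPath = [
    fullyConnectedLayer(3,Name="powerFC",BiasLearnRateFactor=0,Bias=zeros(3,1))
    customClampLayer(0,Pmax,"PowerClamp")
    ];
% Both branches read the observation and are concatenated into a 6x1 action
% ordered [voltage1; voltage2; voltage3; power1; power2; power3].
actorNet = layerGraph(obsIn);
actorNet = addLayers(actorNet,voltagePath);
actorNet = addLayers(actorNet,powerPath);
actorNet = addLayers(actorNet,concatenationLayer(1,2,Name="actOut"));
actorNet = connectLayers(actorNet,"obsIn","voltageFC");
actorNet = connectLayers(actorNet,"obsIn","powerFC");
actorNet = connectLayers(actorNet,"VoltageClamp","actOut/in1");
actorNet = connectLayers(actorNet,"PowerClamp","actOut/in2");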
In this example, customClampLayer is a custom layer that clamps its input to a specified range. You can implement it as follows:
classdef customClampLayer < nnet.layer.Layer
    % Custom layer that clamps its input element-wise to [LowerBound, UpperBound].
    properties
        LowerBound
        UpperBound
    end
    methods
        function layer = customClampLayer(lowerBound, upperBound, name)
            % Store the bounds and the layer name.
            layer.LowerBound = lowerBound;
            layer.UpperBound = upperBound;
            layer.Name = name;
        end
        function Z = predict(layer, X)
            % Saturate X below at LowerBound and above at UpperBound.
            Z = max(layer.LowerBound, min(layer.UpperBound, X));
        end
    end
end
Similarly, to enforce the power constraints individually, add a custom activation layer after the power output layer that keeps each power value non-negative and no greater than its own maximum power limit.
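As a rough illustration of the per-output power limits, the sketch below applies the same max/min expression used in the clamp layer's predict method to a plain vector. The Pmax values and the raw power outputs are hypothetical placeholders, not numbers from the question.

```matlab
% Hypothetical per-unit power limits and raw (unconstrained) actor outputs.
Pmax     = [1.0; 2.0; 1.5];
rawPower = [-0.3; 2.4; 0.7];

% Element-wise clamp: each power is limited to [0, Pmax(i)].
clamped = max(0, min(Pmax, rawPower));
disp(clamped)   % displays 0, 2.0, 0.7
```

Because max and min operate element-wise, passing 0 and a vector such as Pmax as the lower and upper bounds of the clamp layer should give each power output its own limit.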
These custom activation layers constrain the power and voltage values produced by the actor neural network. The approach and code snippets above are a starting point that you can modify to fit your requirements.
For more information on custom deep learning layers, refer to the following documentation: https://www.mathworks.com/help/deeplearning/ug/define-custom-deep-learning-intermediate-layers.html.
Hope it helps!
2 Comments
Emmanouil Tzorakoleftherakis
By the way, another option is to add upper and lower limits to the action space definition. That said, to keep the agent's outputs from constantly hitting the limits you set, you should also add a tanh layer and a scaling layer as the final layers of your actor. That way the output is scaled into the desired range before any final clamping happens.
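A minimal sketch of this suggestion, assuming Reinforcement Learning Toolbox is available, might look as follows. The Pmax values are hypothetical placeholders, obsInfo is recreated here only to make the snippet self-contained, and the bounds follow the action order given in the question.

```matlab
% Observation spec as described in the question (59x1 state vector).
obsInfo = rlNumericSpec([59 1]);

% Hypothetical power limits; replace with the real Pmax of each unit.
Pmax = [1.0; 1.0; 1.0];

% Bounds in the order [voltage1; power1; voltage2; power2; voltage3; power3].
lowerLimit = [0.95; 0; 0.95; 0; 0.95; 0];
upperLimit = [1.05; Pmax(1); 1.05; Pmax(2); 1.05; Pmax(3)];

% Limits declared in the action space definition.
actInfo = rlNumericSpec([6 1],LowerLimit=lowerLimit,UpperLimit=upperLimit);

% tanh maps each output to (-1,1); the scaling layer then maps that
% interval onto (lowerLimit, upperLimit) for each action element.
actorNet = [
    featureInputLayer(obsInfo.Dimension(1))
    fullyConnectedLayer(actInfo.Dimension(1))
    tanhLayer
    scalingLayer(Scale=(upperLimit-lowerLimit)/2, ...
        Bias=(upperLimit+lowerLimit)/2)
    ];
actorNet = dlnetwork(actorNet);
actor = rlContinuousDeterministicActor(actorNet,obsInfo,actInfo);

% Query the actor for a random observation.
act = getAction(actor,{rand(obsInfo.Dimension)});
```

Note that the tanh layer makes the actor nonlinear in the states, so it is no longer a pure linear state-feedback gain as in the original LQR-style actor. Whether that trade-off is acceptable depends on the application.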

Release

R2021b