RL Agent Action Limits Not Working
I am trying to create an RL DDPG agent.
One action has a lower limit of 1 and an upper limit of 3, and the other has a lower limit of 5 and an upper limit of 25. I set these ranges with a scalingLayer, but I see that the action output goes outside those bands.
My code is:
function agent = createAgent(observationInfo,actionInfo,Ts)
L = 64; % number of neurons
statePath = [
featureInputLayer(observationInfo.Dimension(1),'Normalization','none','Name','observation')
fullyConnectedLayer(L,'Name','fc1')
reluLayer('Name','relu1')
fullyConnectedLayer(L,'Name','fc2')
additionLayer(2,'Name','add')
reluLayer('Name','relu2')
fullyConnectedLayer(L,'Name','fc3')
reluLayer('Name','relu3')
fullyConnectedLayer(1,'Name','fc4')];
actionPath = [
featureInputLayer(actionInfo.Dimension(1),'Normalization','none','Name','action')
fullyConnectedLayer(L, 'Name', 'fc5')];
criticNetwork = layerGraph(statePath);
criticNetwork = addLayers(criticNetwork, actionPath);
criticNetwork = connectLayers(criticNetwork,'fc5','add/in2');
criticNetwork = dlnetwork(criticNetwork);
criticOptions = rlOptimizerOptions('LearnRate',1e-3,'GradientThreshold',1);
critic = rlQValueFunction(criticNetwork,observationInfo,actionInfo,...
'ObservationInputNames','observation','ActionInputNames','action');
scale = [1 10]';
bias = [2 15]';
actorNetwork = [
featureInputLayer(observationInfo.Dimension(1),'Normalization','none','Name','observation')
fullyConnectedLayer(L,'Name','fc1')
reluLayer('Name','relu1')
fullyConnectedLayer(L,'Name','fc2')
reluLayer('Name','relu2')
fullyConnectedLayer(L,'Name','fc3')
reluLayer('Name','relu3')
fullyConnectedLayer(2,'Name','fc4')
tanhLayer('Name','tanh1')
scalingLayer('Name','ActorScaling1','Scale', scale,'Bias', bias)];
actorNetwork = dlnetwork(actorNetwork);
actorOptions = rlOptimizerOptions('LearnRate',1e-3,'GradientThreshold', 1);
actor = rlContinuousDeterministicActor(actorNetwork,observationInfo,actionInfo);
agentOptions = rlDDPGAgentOptions(...
'SampleTime',Ts,...
'CriticOptimizerOptions',criticOptions,...
'ActorOptimizerOptions',actorOptions,...
'ExperienceBufferLength',1e6);
agentOptions.NoiseOptions.Variance = 0.6;
agentOptions.NoiseOptions.VarianceDecayRate = 1e-5;
agent = rlDDPGAgent(actor,critic,agentOptions);
Answers (1)
Emmanouil Tzorakoleftherakis
2023-10-13
Edited: Emmanouil Tzorakoleftherakis
2023-11-2
Please take a look at the DDPG algorithm and specifically step 1 here. DDPG promotes exploration by adding noise on top of the agent output, so depending on the noise characteristics, the agent output may go out of bounds even when the actor network itself stays within the tanh and scalingLayer range.
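To make this concrete, here is a minimal numeric sketch that uses the scale and bias from the question. The values of preActivation and noise are hypothetical, chosen only to illustrate the effect; they are not taken from a trained agent.

```matlab
% Scale and bias from the question: tanh output in [-1,1] maps to [1,3] and [5,25].
scale = [1; 10];
bias  = [2; 15];
lowerLimit = bias - scale;   % [1; 5]
upperLimit = bias + scale;   % [3; 25]

% Hypothetical outputs of the last fully connected layer (fc4).
preActivation = [4; -4];

% What tanhLayer followed by scalingLayer computes: scale .* tanh(x) + bias.
actorOutput = scale .* tanh(preActivation) + bias;

% Hypothetical exploration noise sample added on top of the actor output.
noise = [0.5; -0.5];
noisyAction = actorOutput + noise;

% The actor output respects the band, but the noisy action does not.
actorInBounds = all(actorOutput >= lowerLimit & actorOutput <= upperLimit);
outOfBounds   = noisyAction < lowerLimit | noisyAction > upperLimit;
```

In this sketch the actor output sits just inside both bands, and the added noise pushes both elements outside, which matches the behavior described in the question.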
First, make sure the noise variance you have makes sense for your application. For example, you have variance = 0.6, which may be too much for the [1,3] range, since that range is only 2 units wide.
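One way to pick noise levels per action is sketched below. It assumes a sample time Ts = 0.1 (not given in the question) and a hypothetical fraction of 5% of each action range, loosely following the rule of thumb in the rlDDPGAgentOptions documentation that StandardDeviation*sqrt(Ts) should be roughly 1% to 10% of the action range. It also assumes a release in which the noise options expose StandardDeviation (the newer name for Variance) and accept a vector with one entry per action; please check this against your release.

```matlab
% Assumed sample time for illustration; use your own Ts.
Ts = 0.1;

% Action limits from the question.
lowerLimit  = [1; 5];
upperLimit  = [3; 25];
actionRange = upperLimit - lowerLimit;    % [2; 20]

% Hypothetical choice: noise of about 5% of each action range per step.
fraction = 0.05;
noiseStd = fraction * actionRange / sqrt(Ts);

% Apply the per-action values to the DDPG noise model.
agentOptions = rlDDPGAgentOptions('SampleTime',Ts);
agentOptions.NoiseOptions.StandardDeviation = noiseStd;
agentOptions.NoiseOptions.StandardDeviationDecayRate = 1e-5;
```

Scaling the noise per action keeps exploration proportionate: a single scalar value that suits the [5,25] action is large relative to the [1,3] action.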
The other thing you should consider is adding upper and lower limits in the action space definition. You are not showing that definition here, but if you are running into violations, you are most likely not setting limits there.
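A minimal sketch of an action specification with limits, assuming a single action channel with two elements that matches the two-output actor in the question. The name 'action' and the sample rawAction are illustrative only.

```matlab
% Action specification with explicit limits for both action elements.
actionInfo = rlNumericSpec([2 1], ...
    'LowerLimit',[1; 5], ...
    'UpperLimit',[3; 25]);
actionInfo.Name = 'action';   % illustrative name

% Optional safeguard inside the environment: clip whatever action arrives.
rawAction = [3.4; 4.2];       % hypothetical out-of-band action
clippedAction = min(max(rawAction, actionInfo.LowerLimit), actionInfo.UpperLimit);
```

Pass this actionInfo to createAgent so that the actor, the critic, and the agent are all built from the same specification. The clipping line is an extra safeguard you could place in your environment step function if you still observe violations.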
Hope this helps