How to extract the weights of the actor network (inside the step function in the environment) while training the agent in DDPG RL

Hello Everyone,
I am building an LQR type controller. I need to extract the weights of the actor network (which is essentially the feedback K) inside the step function of the enviroment during training. The reason i want to do this is that during training i want to see the K (the actor weights) and add stability condition on the closed loop system. My step function is as follows:
function [nextobs,rwd,isdone,loggedSignals] = step(this,action)
%% I want to extract K (the actor network weights)
loggedSignals = [];
x = this.State;
% Integrate the system dynamics over one sample period
tspan = 0:0.01:this.Ts;
[t2,xk1] = ode15s(@NDAE_func_ode_RL,tspan,x,this.SYS.options1,action,this.SYS.d1,this);
this.State = xk1(end,:)';
nextobs = this.Cd*xk1(end,:)';
% Quadratic (LQR-style) stage reward
rwd = -x'*this.Qd*x - action'*this.Rd*action - 2*x'*this.Nd*action;
% Terminate if the solver stopped early or the state reached the goal
isdone = length(xk1(:,1))<length(tspan) || norm(x) < this.GoalThreshold;
end
Any guidance/suggestions would be highly appreciated.
Thanks,
Nadeem

1 Answer

Harsha Vardhan 2023-11-17
Edited: Harsha Vardhan 2023-11-17
Hi,
I understand that you want to extract the weights of the actor network inside the step function of the environment during training in a DDPG reinforcement learning setup.
To extract the weights, you can follow the steps below:
  • Pass the agent as an argument to the 'step' function.
  • Inside the 'step' function, use the 'getActor' function to obtain the actor function approximator from the agent.
  • Use the 'getLearnableParameters' function to extract the actor's learnable parameters (weights).
Please check the modified code below:
function [nextobs,rwd,isdone,loggedSignals] = step(this,action,agent)
%% Extract K (the actor network weights)
% Obtain the actor function approximator from the agent
actor = getActor(agent);
% Obtain the learnable parameters (weights) from the actor
params = getLearnableParameters(actor);
loggedSignals = [];
x = this.State;
tspan = 0:0.01:this.Ts;
[t2,xk1] = ode15s(@NDAE_func_ode_RL,tspan,x,this.SYS.options1,action,this.SYS.d1,this);
this.State = xk1(end,:)';
nextobs = this.Cd*xk1(end,:)';
rwd = -x'*this.Qd*x - action'*this.Rd*action - 2*x'*this.Nd*action;
isdone = length(xk1(:,1))<length(tspan) || norm(x) < this.GoalThreshold;
end
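One caveat: during training, the built-in loop calls the environment's 'step' with a fixed signature (environment and action only), so the agent cannot be passed as an extra argument at that point. A common workaround is to store a handle to the agent in a property of the environment before calling 'train', and read it inside 'step'. The sketch below assumes your environment class (derived from rl.env.MATLABEnvironment, which is a handle class) defines an 'Agent' property; the property name and setup lines are illustrative, not from the original post.

```matlab
% Setup (before training): store the agent handle in the environment.
% 'MyLQREnv' and the 'Agent' property are hypothetical names.
env = MyLQREnv();
agent = rlDDPGAgent(actor, critic, agentOpts);
env.Agent = agent;            % handle class, so step can read it later

% Inside the environment class, step keeps its standard signature:
function [nextobs,rwd,isdone,loggedSignals] = step(this,action)
    % Read the current actor weights through the stored handle
    actorApprox = getActor(this.Agent);
    params = getLearnableParameters(actorApprox);
    % ... rest of the step function as before ...
end
```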
For more details, please refer to the following documentation:
  1. Extract actor from reinforcement learning agent: https://www.mathworks.com/help/reinforcement-learning/ref/rl.agent.rlqagent.getactor.html
  2. Create Custom Environment Using Step and Reset: https://www.mathworks.com/help/reinforcement-learning/ug/create-matlab-environments-using-custom-functions.html
  3. Deep Deterministic Policy Gradient (DDPG) Agents – Creation and Training: https://www.mathworks.com/help/reinforcement-learning/ug/ddpg-agents.html
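Once the learnable parameters are available inside 'step', the stability condition mentioned in the question can be checked directly. The sketch below assumes a purely linear actor (a single fully connected layer with no bias, so the first parameter cell is the feedback gain K) and that the environment stores linear system matrices; the property names 'Ad', 'Bd', and 'StabilityPenalty' are illustrative assumptions, not from the original post.

```matlab
% Assumes params was obtained via getLearnableParameters(actor)
K = params{1};                              % feedback gain for a linear actor
closedLoopPoles = eig(this.Ad - this.Bd*K); % closed-loop eigenvalues
isStable = all(real(closedLoopPoles) < 0);  % continuous-time stability check
if ~isStable
    rwd = rwd - this.StabilityPenalty;      % hypothetical penalty on instability
end
```

For a discrete-time model, the check would instead be all(abs(closedLoopPoles) < 1).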
Hope this helps in resolving your query!

Release: R2023b
