Hello everyone,
I am building an LQR-type controller. I need to extract the weights of the actor network (which are essentially the feedback gain K) inside the step function of the environment during training. The reason is that I want to monitor K (the actor weights) during training and impose a stability condition on the closed-loop system. My step function is as follows:
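function [nextobs,rwd,isdone,loggedSignals] = step(this,action)
    % I want to extract K (the actor network weights) here
    loggedSignals = [];
    x = this.State;
    % Integrate the plant over one sample period Ts
    tspan = 0:0.01:this.Ts;
    [~,xk1] = ode15s(@NDAE_func_ode_RL,tspan,x,this.SYS.options1,action,this.SYS.d1,this);
    % Update the state and compute the observation
    this.State = xk1(end,:)';
    nextobs = this.Cd*xk1(end,:)';
    % Negative quadratic LQR cost, including the cross term
    rwd = -x'*this.Qd*x - action'*this.Rd*action - 2*x'*this.Nd*action;
    % Stop if the solver terminated early or the state is close to the origin
    isdone = length(xk1(:,1))<length(tspan) || norm(x) < this.GoalThreshold;
end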
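For reference, the stability condition mentioned above could be sketched as follows. This is only an illustration under assumptions that are not part of the step function: `Ad` and `Bd` are a hypothetical discrete-time linearization of the plant with made-up values, the full state is observed, the control law is taken as u = -K*x (so K would be the negated actor weights if the actor computes u = W*x), and `isStableGain` is a name chosen here for illustration. It does not show how to obtain K from the agent.

```matlab
% Hypothetical discrete-time linearization (illustrative values only,
% not taken from the NDAE model in the step function).
Ad = [1 0.1; 0 1];
Bd = [0.005; 0.1];

% Hypothetical feedback gain standing in for the actor weights.
K = [10 5];

% Closed-loop matrix under the assumed control law u = -K*x.
Acl = Ad - Bd*K;

% Discrete-time stability: all eigenvalues strictly inside the unit circle.
rho = max(abs(eig(Acl)));
isStable = rho < 1;

% The same check as a reusable handle (hypothetical helper name).
isStableGain = @(Kc) max(abs(eig(Ad - Bd*Kc))) < 1;
```

If the observation is `Cd*x` rather than the full state, the closed-loop matrix under the same assumptions would be `Ad - Bd*K*Cd` instead.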
Any guidance/suggestions would be highly appreciated.