RL DDPG isn't learning
Hello,
I am trying to train a model to optimise the dispatch of power generation units. I have six generating units, each with a different operating cost. On top of that, I have a load profile that must be met by the sum of the power contributed by each unit. Of course, the power deviation is allowed to stay within a range of +/- 0.05.
Here is how I have modelled it:
I use the RL agent's six actions as the power of each unit in per unit. I multiply each action by a gain (the base value) to get the power in MW, then feed each unit's MW value into its respective operating cost function (quadratic in nature). The observations given to the RL agent are the sum of power minus the load profile (its magnitude in per unit and its integral) and the load profile itself. The reward is defined as:
r1 = negative square of the deviation
r2 = +2 if the deviation is within +/- 0.05
r3 = negative sum of the operating costs
Reward = r1 + r2 + r3
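To make the reward above concrete, here is a minimal Python sketch of a single step. It is not the actual model: the base value `BASE_MW`, the coefficients in `COST_COEFFS`, and the function names are hypothetical placeholders, and the deviation is assumed to be computed in per unit as the sum of the six actions minus the load.

```python
# Illustrative sketch only. BASE_MW and COST_COEFFS are made-up values;
# the real base value and cost coefficients come from the actual model.
BASE_MW = 100.0   # assumed gain converting per-unit actions to MW
TOLERANCE = 0.05  # allowed power deviation from the post

# Assumed quadratic cost coefficients (a, b, c) per unit: a*P^2 + b*P + c
COST_COEFFS = [
    (0.010, 2.0, 10.0),
    (0.012, 1.8, 12.0),
    (0.008, 2.2, 9.0),
    (0.015, 1.5, 11.0),
    (0.011, 2.1, 8.0),
    (0.009, 1.9, 10.0),
]


def operating_cost(p_mw, coeffs):
    """Quadratic operating cost of one unit at p_mw megawatts."""
    a, b, c = coeffs
    return a * p_mw ** 2 + b * p_mw + c


def step_reward(actions_pu, load_pu):
    """Return (reward, (r1, r2, r3)) for one time step."""
    # Deviation in per unit: total generation minus load (assumed convention).
    deviation = sum(actions_pu) - load_pu
    costs = [
        operating_cost(p * BASE_MW, k)
        for p, k in zip(actions_pu, COST_COEFFS)
    ]
    r1 = -(deviation ** 2)                            # penalty on deviation
    r2 = 2.0 if abs(deviation) <= TOLERANCE else 0.0  # bonus inside the band
    r3 = -sum(costs)                                  # penalty on total cost
    return r1 + r2 + r3, (r1, r2, r3)


reward, terms = step_reward([0.5] * 6, load_pu=3.0)
print("reward:", reward, "terms (r1, r2, r3):", terms)
```

Printing the three terms separately shows their relative magnitudes, which depend entirely on the assumed base value and coefficients.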
I used DDPG with learning rates of 0.0001 and 0.001 for the actor and critic respectively, a sample time of 0.4, a simulation time of 24, an experience buffer length of 1e6, and a mini-batch size of 256.
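For reference, the settings above imply the following step counts. This is a small arithmetic sketch, assuming one agent step per sample time and that every episode runs for the full simulation time without early termination; the variable names are only illustrative.

```python
# Values taken from the post.
SAMPLE_TIME = 0.4
SIM_TIME = 24.0
EPISODES = 15000
BUFFER_LENGTH = int(1e6)
MINI_BATCH_SIZE = 256

# round() guards against floating-point error in the division.
steps_per_episode = round(SIM_TIME / SAMPLE_TIME)

# Total transitions collected over the whole training run
# (assumes no episode terminates early).
total_transitions = steps_per_episode * EPISODES

# Fraction of the experience buffer that is filled by the end of training.
buffer_fill = total_transitions / BUFFER_LENGTH

print("steps per episode:", steps_per_episode)
print("total transitions:", total_transitions)
print("buffer fill fraction:", buffer_fill)
```

Under these assumptions the run collects fewer transitions than the buffer can hold, so no experience is ever discarded during training.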
The training was run for 15000 episodes, but it does not converge to the expected power deviation range of +/- 0.05. I also observed that the plot of Q0 versus episode number keeps increasing indefinitely.
What might be the problem? Please bear in mind that I am a beginner in this area. Thank you.
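As a sanity check on the Q0 observation, the sketch below computes a rough upper bound on the discounted return of one episode under the reward defined above. It assumes the operating costs are never negative, so the per-step reward can be at most +2 (r1 and r3 are both at most zero). The discount factor is not stated in the post; 0.99 is an assumption used only for illustration, as is the count of 60 steps per episode (24 / 0.4).

```python
# Assumptions, not values confirmed by the post:
GAMMA = 0.99            # assumed discount factor
STEPS_PER_EPISODE = 60  # 24 / 0.4, assuming no early termination
MAX_STEP_REWARD = 2.0   # r2 = +2 while r1 <= 0 and r3 <= 0


def max_discounted_return(max_reward, gamma, steps):
    """Upper bound on sum_{k=0}^{steps-1} gamma**k * reward_k."""
    if gamma == 1.0:
        return max_reward * steps
    # Closed form of the finite geometric series.
    return max_reward * (1.0 - gamma ** steps) / (1.0 - gamma)


bound = max_discounted_return(MAX_STEP_REWARD, GAMMA, STEPS_PER_EPISODE)
undiscounted = max_discounted_return(MAX_STEP_REWARD, 1.0, STEPS_PER_EPISODE)

print("discounted bound:", bound)
print("undiscounted bound:", undiscounted)
```

If the critic's estimate of the initial state value keeps rising past a bound of this kind, the estimate cannot match any achievable return, which would suggest the critic is overestimating rather than the policy improving.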