Using time as a negative reward in RL toolbox

Amin Moradi 2022-2-24

Answered: Kartik Saxena 2023-11-30

I want to use the Reinforcement Learning Toolbox to train a DQN agent. Right now, I'm implementing the reward in a custom step function. The problem is that I don't know how to penalize the agent for taking too long to reach the objective. How should I add time to my reward function in this toolbox? Your help is appreciated.

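function [NextObs,Reward,IsDone,LoggedSignals] = WW6_StepFunction_genloss(Action,LoggedSignals)
% Custom step function: apply the action, update the state, and return the reward.
a = Action;
obj = 4;    % not used in this function
d = [1 2];
state = LoggedSignals.State;
% attack_eff_WW6 is the poster's own function; its third output is used as the reward.
[next_state, ~, genloss] = attack_eff_WW6(state, a, d);
LoggedSignals.State = next_state;
NextObs = LoggedSignals.State;
Down = nnz(~next_state);    % number of zero entries in the next state
IsDone = Down == 11;        % episode ends when 11 entries are zero
Reward = genloss;
end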

Answers (1)

Kartik Saxena 2023-11-30

Hi,

I understand that you want to add a time penalty to the reward function so that the agent is penalized for taking too long.

The following example in the MathWorks documentation should be useful for this purpose:

https://www.mathworks.com/help/reinforcement-learning/ug/create-matlab-environments-using-custom-functions.html

Following that example, you can introduce a time penalty in your step function by subtracting a value from the reward on every step, sized to your requirements, rather than adding 1 as the example does.
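As an illustration of the per-step deduction described above, here is a minimal sketch, not a definitive implementation. It assumes the state has 11 entries and that a constant penalty per step is acceptable. The names `stepWithTimePenalty`, `stepPenalty`, `StepCount`, and `stubDynamics` are hypothetical and do not come from the question or the toolbox; `stubDynamics` only stands in for the poster's `attack_eff_WW6` so that the script runs on its own.

```matlab
% Sketch: subtract a fixed penalty from the reward on every step.
% All names and values below are assumptions made for illustration.

LoggedSignals.State = ones(11,1);   % assumed: 11 entries, all nonzero at start
LoggedSignals.StepCount = 0;        % hypothetical counter kept in LoggedSignals

totalReward = 0;
IsDone = false;
while ~IsDone
    Action = 1;                     % placeholder action, ignored by the stub
    [~,Reward,IsDone,LoggedSignals] = stepWithTimePenalty(Action,LoggedSignals);
    totalReward = totalReward + Reward;
end
fprintf('Episode took %d steps, return %.2f\n', LoggedSignals.StepCount, totalReward);

function [NextObs,Reward,IsDone,LoggedSignals] = stepWithTimePenalty(Action,LoggedSignals)
    stepPenalty = 0.1;              % assumed value; tune against the scale of genloss
    state = LoggedSignals.State;
    % Replace this stub call with the real dynamics, e.g. attack_eff_WW6.
    [next_state, genloss] = stubDynamics(state, Action);
    LoggedSignals.State = next_state;
    LoggedSignals.StepCount = LoggedSignals.StepCount + 1;
    NextObs = next_state;
    IsDone = nnz(~next_state) == 11;
    % Every step costs stepPenalty, so longer episodes earn a lower return.
    Reward = genloss - stepPenalty;
end

function [next_state, genloss] = stubDynamics(state, ~)
    % Stand-in dynamics: zero the first nonzero entry and report a loss of 1.
    next_state = state;
    idx = find(next_state, 1);
    next_state(idx) = 0;
    genloss = 1;
end
```

Because the penalty is applied on every call to the step function, the total deduction grows with the number of steps in the episode. If the penalty is too small compared with `genloss`, it may have little effect on the learned policy, so the value likely needs tuning.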

I hope this resolves your issue.

Categories

Control Systems Reinforcement Learning Toolbox Environments

Products

MATLAB

Release

R2021b
