SARSA Reinforcement Learning

Version 1.0.0.0 (117.2 KB) by Bhartendu

Maze solving using SARSA, Reinforcement Learning


Updated 24 May 2017


Refer to Section 6.4 (Sarsa: On-Policy TD Control) of Reinforcement Learning: An Introduction, R. S. Sutton and A. G. Barto, MIT Press.
In this demo, two different mazes are solved with the reinforcement learning technique SARSA.
State-Action-Reward-State-Action (SARSA) is an on-policy algorithm for learning a policy for a Markov decision process, used in reinforcement learning.
SARSA update of the action-value function:

Q(S_t, A_t) ← Q(S_t, A_t) + α [ R_{t+1} + γ Q(S_{t+1}, A_{t+1}) − Q(S_t, A_t) ]
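The update above can be written as a single function. The following is a minimal Python sketch for illustration only; it is not taken from the submission's MATLAB code. The name `sarsa_update` and the choice of a dictionary keyed by (state, action) pairs are assumptions made for this example.

```python
def sarsa_update(Q, s, a, r, s_next, a_next, alpha, gamma):
    """Apply one SARSA update to Q in place and return the TD error.

    Q is assumed to be a dict mapping (state, action) pairs to values.
    """
    td_target = r + gamma * Q[(s_next, a_next)]  # R_{t+1} + gamma * Q(S_{t+1}, A_{t+1})
    td_error = td_target - Q[(s, a)]             # bracketed term in the update rule
    Q[(s, a)] += alpha * td_error
    return td_error


# Hypothetical two-entry table, just to exercise the function.
Q = {("A", "right"): 1.0, ("B", "up"): 2.0}
err = sarsa_update(Q, "A", "right", 0.5, "B", "up", alpha=0.1, gamma=0.9)
print(Q[("A", "right")], err)
```

Note that the next action `a_next` is the one the agent will actually take in `s_next`, which is what makes SARSA on-policy.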

Learning rate (α)
The learning rate determines to what extent the newly acquired information will override the old information. A factor of 0 will make the agent not learn anything, while a factor of 1 would make the agent consider only the most recent information.
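One way to see the two extremes is to rewrite the update as a weighted average of the old estimate and the new target, since Q + α(target − Q) = (1 − α)Q + α·target. The Python sketch below illustrates this; the helper name `blend` and the numbers are made up for the example.

```python
def blend(old_value, target, alpha):
    """SARSA-style update written as a weighted average of old value and target."""
    return (1 - alpha) * old_value + alpha * target


old_value, target = 2.0, 10.0          # illustrative numbers only
print(blend(old_value, target, 0.0))   # 2.0: the estimate never changes
print(blend(old_value, target, 1.0))   # 10.0: the old estimate is discarded
print(blend(old_value, target, 0.5))   # 6.0: halfway between the two
```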

Discount factor (γ)
The discount factor determines the importance of future rewards. A factor of 0 will make the agent "opportunistic" by only considering current rewards, while a factor approaching 1 will make it strive for a long-term high reward. If the discount factor meets or exceeds 1, the Q values may diverge.
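The effect of γ can be illustrated by computing the discounted sum of a fixed reward sequence. This is a small Python sketch with an assumed helper name, `discounted_return`, and an invented reward sequence of four unit rewards.

```python
def discounted_return(rewards, gamma):
    """Return r_0 + gamma*r_1 + gamma^2*r_2 + ... for a finite reward list."""
    g = 0.0
    for r in reversed(rewards):  # accumulate from the last reward backwards
        g = r + gamma * g
    return g


rewards = [1.0, 1.0, 1.0, 1.0]
print(discounted_return(rewards, 0.0))  # 1.0: only the immediate reward counts
print(discounted_return(rewards, 0.5))  # 1.875: later rewards count progressively less
print(discounted_return(rewards, 1.0))  # 4.0: every reward counts fully
```

With γ = 1 the sum keeps growing as the sequence gets longer, whereas for γ < 1 and rewards of at most 1 it stays below 1/(1 − γ). This is one way to see why values may fail to stay bounded when γ reaches 1 in tasks that do not terminate.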

Note: Convergence has been tested only on particular examples; in general, convergence is not guaranteed for this demo.
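To show how the pieces fit together, here is a minimal Python sketch of a full SARSA training loop. It does not reproduce the mazes in this submission: the environment is a hypothetical five-cell corridor with the goal at the right end, a reward of -1 per move, and an epsilon-greedy policy. All names (`step`, `epsilon_greedy`, `train`) and parameter values are assumptions chosen for the example.

```python
import random

LEFT, RIGHT = 0, 1       # action indices (assumed encoding)
N_STATES = 5             # corridor cells 0..4
GOAL = N_STATES - 1      # reaching cell 4 ends the episode


def step(state, action):
    """Move one cell; walls clamp the position. Every move costs -1."""
    nxt = max(0, state - 1) if action == LEFT else min(GOAL, state + 1)
    return nxt, -1.0, nxt == GOAL


def epsilon_greedy(Q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(2)
    return max((LEFT, RIGHT), key=lambda a: Q[state][a])


def train(episodes=2000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0, max_steps=1000):
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _episode in range(episodes):
        s = 0
        a = epsilon_greedy(Q, s, epsilon, rng)
        for _ in range(max_steps):
            s2, r, done = step(s, a)
            a2 = epsilon_greedy(Q, s2, epsilon, rng)
            # The goal state has no future value, so the target is just r there.
            target = r if done else r + gamma * Q[s2][a2]
            Q[s][a] += alpha * (target - Q[s][a])
            s, a = s2, a2
            if done:
                break
    return Q


Q = train()
for s in range(N_STATES):
    print(s, [round(q, 2) for q in Q[s]])
```

In the cell next to the goal, the value of moving right should settle at -1, the cost of the single remaining move. As the note above says, such behaviour on one small example does not establish convergence in general.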

Cite As

Bhartendu (2024). SARSA Reinforcement Learning (https://www.mathworks.com/matlabcentral/fileexchange/63089-sarsa-reinforcement-learning), MATLAB Central File Exchange. Retrieved 31 December 2024.

MATLAB Release Compatibility

Created with R2016a

Compatible with any release

Platform Compatibility

Windows macOS Linux


Version	Published	Release Notes
1.0.0.0	24 May 2017	