SARSA Reinforcement Learning

Version 1.0.0.0 (117.2 KB) by Bhartendu
Maze solving using SARSA, Reinforcement Learning

1.4K downloads

Updated 2017/5/24


Refer to Section 6.4 (Sarsa: On-Policy TD Control) of Reinforcement Learning: An Introduction, R. S. Sutton and A. G. Barto, MIT Press.
This demo solves two different mazes using the reinforcement learning technique SARSA.
State-Action-Reward-State-Action (SARSA) is an on-policy algorithm for learning a policy for a Markov decision process, used in reinforcement learning.
SARSA update of the action-value function:

Q(S_t, A_t) := Q(S_t, A_t) + α [ R_{t+1} + γ Q(S_{t+1}, A_{t+1}) − Q(S_t, A_t) ]
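
The update rule above can be sketched in a few lines of Python. This is only an illustrative sketch, not the MATLAB code shipped with this submission: the 4x4 maze layout, the reward scheme (-1 per step, 0 at the goal), the epsilon-greedy exploration, and every function name and default parameter below are assumptions made for the example.

```python
import random

ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

# Hypothetical 4x4 maze (not one of the demo's mazes): 0 = free cell, 1 = wall.
MAZE = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
START, GOAL = (0, 0), (3, 3)


def step(state, action):
    """Move if the target cell is free; assumed reward is -1 per step, 0 at the goal."""
    r, c = state[0] + ACTIONS[action][0], state[1] + ACTIONS[action][1]
    if 0 <= r < 4 and 0 <= c < 4 and MAZE[r][c] == 0:
        state = (r, c)
    done = state == GOAL
    return state, (0.0 if done else -1.0), done


def epsilon_greedy(Q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else the first greedy action."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    values = Q[state]
    return values.index(max(values))


def sarsa_update(Q, s, a, reward, s_next, a_next, alpha, gamma):
    """Apply Q(S_t, A_t) += alpha * [R_{t+1} + gamma * Q(S_{t+1}, A_{t+1}) - Q(S_t, A_t)]."""
    td_target = reward + gamma * Q[s_next][a_next]
    Q[s][a] += alpha * (td_target - Q[s][a])


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0, max_steps=200):
    """Run tabular SARSA; all default values here are illustrative assumptions."""
    rng = random.Random(seed)
    Q = {(r, c): [0.0] * 4 for r in range(4) for c in range(4)}
    for _ in range(episodes):
        s = START
        a = epsilon_greedy(Q, s, epsilon, rng)
        for _ in range(max_steps):
            s_next, reward, done = step(s, a)
            # On-policy: the next action is chosen by the same policy being learned.
            a_next = epsilon_greedy(Q, s_next, epsilon, rng)
            sarsa_update(Q, s, a, reward, s_next, a_next, alpha, gamma)
            s, a = s_next, a_next
            if done:
                break
    return Q
```

Calling `train()` returns a table mapping each cell to four action values. Because the action used in the update target is the one the agent actually takes next, exploration steps influence the learned values, which is what distinguishes SARSA from off-policy methods.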

Learning rate (α)
The learning rate determines to what extent newly acquired information overrides old information. A factor of 0 makes the agent learn nothing, while a factor of 1 makes the agent consider only the most recent information.
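
As a small numeric illustration of the two extremes, the sketch below applies one update toward a fixed target for several learning rates. The function name `blended_estimate` and the sample values are hypothetical, chosen only for this example.

```python
def blended_estimate(old_q, td_target, alpha):
    """Move the old estimate a fraction alpha of the way toward the new target."""
    return old_q + alpha * (td_target - old_q)


old_q, td_target = 2.0, 10.0  # illustrative values only
for alpha in (0.0, 0.5, 1.0):
    print(alpha, blended_estimate(old_q, td_target, alpha))
# prints:
# 0.0 2.0
# 0.5 6.0
# 1.0 10.0
```

With α = 0 the old estimate is unchanged, and with α = 1 it is replaced entirely by the new target.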

Discount factor (γ)
The discount factor determines the importance of future rewards. A factor of 0 will make the agent "opportunistic" by only considering current rewards, while a factor approaching 1 will make it strive for a long-term high reward. If the discount factor meets or exceeds 1, the Q values may diverge.
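
The effect of γ can be seen by computing the discounted sum of a reward sequence. This is a minimal sketch; the helper `discounted_return` and the reward values are assumptions for illustration and do not come from the demo.

```python
def discounted_return(rewards, gamma):
    """Compute r_0 + gamma * r_1 + gamma^2 * r_2 + ... by folding from the end."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


rewards = [1.0, 1.0, 1.0, 10.0]  # illustrative: a large reward arrives late
print(discounted_return(rewards, 0.0))  # 1.0, only the immediate reward counts
print(discounted_return(rewards, 0.5))  # 3.0
print(discounted_return(rewards, 1.0))  # 13.0, all rewards count equally
```

For a constant reward of 1 per step, the sum stays below 1 / (1 − γ) when γ < 1 but grows with the number of steps when γ = 1, which is why values may diverge in tasks that never terminate.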

Note: Convergence was tested only on particular examples; in general, convergence is not guaranteed for this demo.

Cite As

Bhartendu (2022). SARSA Reinforcement Learning (https://www.mathworks.com/matlabcentral/fileexchange/63089-sarsa-reinforcement-learning), MATLAB Central File Exchange. Retrieved .

MATLAB Release Compatibility
Created with R2016a
Compatible with any release
Platform Compatibility
Windows macOS Linux
