MDP robot grid-world example

Version 1.0.0.0 (7.7 KB) by Aaron T. Becker's Robot Swarm Lab

Applies value iteration to learn a policy for a robot in a grid world.

Rating: 5.0 (1 rating)

811 downloads

Updated 2015/11/24

Applies value iteration to learn a policy for a Markov Decision Process (MDP): a robot in a grid world.
Each cell of the world is either free space (0) or an obstacle (1). On each turn the robot can move in one of 8 directions or stay in place. The reward function gives one free cell, the goal location, a high reward; every other free cell carries a small penalty, and obstacles carry a large negative reward. Value iteration is used to learn an optimal 'policy', a function that assigns a control input to every possible location.
Video: https://youtu.be/gThGerajccM
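
The value-iteration loop described above can be sketched in a few lines. This is a minimal Python illustration, not the author's MATLAB code: the reward magnitudes, the discount factor `gamma`, the absorbing goal, and the rule that a move into an obstacle or off the grid leaves the robot in place are all assumptions made for the example, and the names `value_iteration` and `is_blocked` are hypothetical.

```python
# Eight compass moves plus "stay", as (row, col) offsets, clockwise from north.
ACTIONS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1), (0, 0)]

# Assumed magnitudes: the description only says "high", "small" and "large negative".
GOAL_REWARD = 100.0
STEP_PENALTY = -1.0
OBSTACLE_REWARD = -100.0


def is_blocked(world, r, c):
    """True if (r, c) lies off the grid or is an obstacle cell (1)."""
    off_grid = not (0 <= r < len(world) and 0 <= c < len(world[0]))
    return off_grid or world[r][c] == 1


def value_iteration(world, goal, gamma=0.95, tol=1e-9, max_sweeps=10_000):
    """Return (value, policy) grids; policy entries index into ACTIONS."""
    goal = tuple(goal)
    rows, cols = len(world), len(world[0])
    value = [[0.0] * cols for _ in range(rows)]
    policy = [[None] * cols for _ in range(rows)]
    # Assumption: the goal is absorbing, so its value is fixed at the goal reward.
    value[goal[0]][goal[1]] = GOAL_REWARD
    policy[goal[0]][goal[1]] = len(ACTIONS) - 1  # "stay"

    for _ in range(max_sweeps):
        delta = 0.0
        for r in range(rows):
            for c in range(cols):
                if world[r][c] == 1 or (r, c) == goal:
                    continue
                best_q, best_a = float("-inf"), None
                for a, (dr, dc) in enumerate(ACTIONS):
                    nr, nc = r + dr, c + dc
                    if is_blocked(world, nr, nc):
                        # Assumption: a blocked move costs the obstacle reward
                        # and the robot remains where it was.
                        q = OBSTACLE_REWARD + gamma * value[r][c]
                    else:
                        q = STEP_PENALTY + gamma * value[nr][nc]
                    if q > best_q:
                        best_q, best_a = q, a
                delta = max(delta, abs(best_q - value[r][c]))
                value[r][c] = best_q  # in-place (Gauss-Seidel style) update
                policy[r][c] = best_a
        if delta < tol:
            break
    return value, policy
```

Calling `value_iteration([[0, 0, 0], [0, 0, 0], [0, 0, 0]], (0, 0))` returns a value grid and a policy grid whose entries index into `ACTIONS`; in that open 3x3 world every free cell's policy points along a shortest path to the goal.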

This function compares a deterministic robot, which always executes its moves perfectly, with a stochastic robot, which has a small probability of moving ±45 degrees from the commanded direction. The optimal policy for the stochastic robot avoids narrow passages and steers toward the center of corridors.
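
To see why the stochastic robot keeps away from walls, it helps to write the backup as an expectation over the possible executed moves. The following Python sketch is illustrative only and is not taken from the MATLAB file: the slip probability of 0.1 per side, the reward magnitudes, the discount factor, and the names `outcomes`, `q_value`, and `value_iteration` are assumptions chosen for the example.

```python
# Eight compass moves clockwise from north, then "stay" (index 8).
ACTIONS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1), (0, 0)]
STAY = 8

# Assumed reward magnitudes (the description gives no numbers).
GOAL_REWARD, STEP_PENALTY, OBSTACLE_REWARD = 100.0, -1.0, -100.0


def outcomes(action, slip_prob):
    """Return [(probability, executed_action)] for a commanded action.

    With probability slip_prob each, the executed move is rotated 45 degrees
    to either side of the command; "stay" is assumed to never slip.
    """
    if action == STAY or slip_prob == 0.0:
        return [(1.0, action)]
    return [(1.0 - 2.0 * slip_prob, action),
            (slip_prob, (action - 1) % 8),
            (slip_prob, (action + 1) % 8)]


def q_value(world, value, r, c, action, slip_prob, gamma):
    """Expected one-step return of commanding `action` in cell (r, c)."""
    q = 0.0
    for prob, executed in outcomes(action, slip_prob):
        nr, nc = r + ACTIONS[executed][0], c + ACTIONS[executed][1]
        inside = 0 <= nr < len(world) and 0 <= nc < len(world[0])
        if not inside or world[nr][nc] == 1:
            # Assumption: hitting an obstacle or the boundary costs the
            # obstacle reward and leaves the robot in place.
            q += prob * (OBSTACLE_REWARD + gamma * value[r][c])
        else:
            q += prob * (STEP_PENALTY + gamma * value[nr][nc])
    return q


def value_iteration(world, goal, slip_prob=0.1, gamma=0.95, tol=1e-9, max_sweeps=10_000):
    """Value iteration with the slip model; slip_prob=0 is the deterministic robot."""
    goal = tuple(goal)
    rows, cols = len(world), len(world[0])
    value = [[0.0] * cols for _ in range(rows)]
    policy = [[None] * cols for _ in range(rows)]
    value[goal[0]][goal[1]] = GOAL_REWARD  # assumed absorbing goal
    policy[goal[0]][goal[1]] = STAY
    for _ in range(max_sweeps):
        delta = 0.0
        for r in range(rows):
            for c in range(cols):
                if world[r][c] == 1 or (r, c) == goal:
                    continue
                qs = [q_value(world, value, r, c, a, slip_prob, gamma)
                      for a in range(len(ACTIONS))]
                best_a = max(range(len(ACTIONS)), key=qs.__getitem__)
                delta = max(delta, abs(qs[best_a] - value[r][c]))
                value[r][c], policy[r][c] = qs[best_a], best_a
        if delta < tol:
            break
    return value, policy
```

In a 3-row open corridor with the goal in the middle row, a robot starting in a corner is commanded diagonally toward the center row under `slip_prob=0.1`, because commanding a move along the wall risks slipping into it. With `slip_prob=0.0` the same function reduces to the deterministic case.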

From Chapter 14 in 'Probabilistic Robotics', ISBN-13: 978-0262201629, http://www.probabilistic-robotics.org

Aaron Becker, March 11, 2015

Cite As

Aaron T. Becker's Robot Swarm Lab (2025). MDP robot grid-world example (https://ww2.mathworks.cn/matlabcentral/fileexchange/49992-mdp-robot-grid-world-example), MATLAB Central File Exchange. Retrieved 2025/10/8.

MATLAB Release Compatibility

Created with R2014b

Compatible with any release

Platform Compatibility

Windows macOS Linux

Categories

Find more on Robotics System Toolbox in Help Center and MATLAB Answers

Acknowledgements

Inspired: Markov Decision Process (MDP) Algorithm, Kilobot Swarm Control using Matlab + Arduino

MDPgridworldExample()

Version	Published	Release Notes
1.0.0.0	2015/11/24	Added link to video https://youtu.be/gThGerajccM