Q-learning movement restriction

Question

I am implementing Q-learning on a 5 x 5 grid using the following code.

%% Create Grid
GW = createGridWorld(5,5);
GW.CurrentState = '[1,1]';
GW.TerminalStates = '[5,1]';
GW.ObstacleStates = ["[3,1]";"[3,2]";"[3,3]"];
updateStateTranstionForObstacles(GW)
GW.T(state2idx(GW,"[2,2]"),:,:) = 0;
GW.T(state2idx(GW,"[2,2]"),state2idx(GW,"[4,2]"),:) = 1;
  

It creates a 5 x 5 grid that starts at [1,1] and terminates at [5,1]. I want the starting point to be random within this 5 x 5 grid, and the agent should only move among its 8 neighbours and then terminate.

Answer 1

Sachin Lodhi 2024-4-29

Hello Raza,

To modify your Q-learning setup for a 5x5 grid in MATLAB so that the starting point is random and the agent can move to any of its 8 neighbors (including diagonals), you need to adjust the grid world settings and the action space. Reinforcement Learning Toolbox provides tools for creating and manipulating grid worlds; the random starting point and the diagonal moves each need a small change to the default setup.

Here is an approach to achieve your objectives:

Randomize the Starting State: Set the "GW.CurrentState" property to a random cell that is neither an obstacle nor the terminal state.
Define the Action Space for 8 Neighbors: By default, the grid world actions are limited to the four moves N, S, E, and W. To include diagonal movements, pass 'Kings' as the third argument to createGridWorld, which adds the NE, NW, SE, and SW actions.

With 'Kings' moves, the diagonal actions do not have to be built by hand. Manual changes to the transition probabilities (GW.T) are only needed for special transitions, such as the jump from [2,2] to [4,2] in your code.

Here's how you can adjust your code:

%% Create Grid
% 'Kings' adds the four diagonal moves to the default N, S, E, W actions
GW = createGridWorld(5,5,'Kings');
% Randomize starting state, avoiding obstacles and the terminal state
validStartStates = setdiff(1:25, [state2idx(GW, "[5,1]"); state2idx(GW, "[3,1]"); state2idx(GW, "[3,2]"); state2idx(GW, "[3,3]")]);
randomStartStateIdx = validStartStates(randi(length(validStartStates)));
GW.CurrentState = idx2state(GW, randomStartStateIdx);
GW.TerminalStates = '[5,1]';
GW.ObstacleStates = ["[3,1]";"[3,2]";"[3,3]"];
% Block transitions into the obstacle cells
updateStateTranstionForObstacles(GW);
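
Note that setting GW.CurrentState picks the start only once, when the grid is built. If the start should be drawn again at the beginning of every training episode, one option is the ResetFcn property of the rlMDPEnv environment, which returns the index of the initial state. Below is a minimal sketch under that assumption, using the same grid and obstacle layout; the names blocked and pickStart are illustrative and not part of the toolbox.

```matlab
% Sketch: re-draw the start state at every episode reset.
% Assumes Reinforcement Learning Toolbox; pickStart is a hypothetical name.
GW = createGridWorld(5,5,'Kings');
GW.TerminalStates = '[5,1]';
GW.ObstacleStates = ["[3,1]";"[3,2]";"[3,3]"];
updateStateTranstionForObstacles(GW);

% Indices of cells the agent must not start in (terminal + obstacles)
blocked = [state2idx(GW,"[5,1]"); state2idx(GW,"[3,1]"); ...
           state2idx(GW,"[3,2]"); state2idx(GW,"[3,3]")];
validStartStates = setdiff(1:25, blocked);

% ResetFcn must return the index of the initial state
pickStart = @() validStartStates(randi(numel(validStartStates)));

env = rlMDPEnv(GW);
env.ResetFcn = pickStart;

% Each reset now begins an episode from a random valid cell
startIdx = reset(env);
```

Because the random draw happens inside the function handle, the agent sees a different valid start in each episode without any change to the grid itself.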

Combining 8-directional movement with a random starting point, obstacles, and grid edges can still call for some customization of the grid world's transition dynamics. The MATLAB documentation for createGridWorld in Reinforcement Learning Toolbox gives more detail on the available moves, the transition probabilities, and the state space:

https://www.mathworks.com/help/reinforcement-learning/ref/creategridworld.html
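
To check which moves a grid world actually exposes, the Actions and T properties can be inspected. The following is a short sketch, assuming the 'Standard' and 'Kings' move types of createGridWorld; the variable names GWstd and GWkings are illustrative.

```matlab
% Sketch: compare the action sets of the two move types.
GWstd   = createGridWorld(5,5);           % default 'Standard' moves
GWkings = createGridWorld(5,5,'Kings');   % adds the diagonal moves

disp(GWstd.Actions')     % four compass moves
disp(GWkings.Actions')   % eight moves, including diagonals

% T is indexed as (current state, next state, action)
sz = size(GWkings.T);
disp(sz)
```

The third dimension of T grows with the number of actions, so any manual edits to GW.T that use ":" for the action index apply to the diagonal moves as well.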

I hope this helps!
