Q-learning movement restriction

Raza Ali 2021-7-27
I am implementing Q-learning on a 5 x 5 grid using the following code.
%% Create Grid
GW = createGridWorld(5,5);
GW.CurrentState = '[1,1]';
GW.TerminalStates = '[5,1]';
GW.ObstacleStates = ["[3,1]";"[3,2]";"[3,3]"];
updateStateTranstionForObstacles(GW)
GW.T(state2idx(GW,"[2,2]"),:,:) = 0;
GW.T(state2idx(GW,"[2,2]"),state2idx(GW,"[4,2]"),:) = 1;
It creates a 5 x 5 grid in which the agent starts at [1,1] and terminates at [5,1]. I want the starting point to be random within this 5 x 5 grid, and I want the agent to move only to its 8 neighbouring cells and then terminate.

Answers (1)

Sachin Lodhi 2024-4-29
Hello Raza,
To modify the Q-learning implementation for a 5x5 grid in MATLAB so that the starting point is random and the agent can only move to one of its 8 neighbors (including diagonally), you will need to adjust the grid world settings and the action space. The Reinforcement Learning Toolbox provides tools for creating and manipulating grid worlds, but diagonal movements and random starting points are not part of the default setup and need some extra configuration.
Here is an approach to achieve your objectives:
  1. Randomize the Starting State: You can randomize the starting state by setting the "GW.CurrentState" property to a random cell that is not an obstacle or the terminal state.
  2. Define the Action Space for 8 Neighbors: By default, the grid world actions are limited to up, down, left, and right (the 'Standard' move set). To include diagonal movements, create the grid with the 'Kings' move set, createGridWorld(5,5,'Kings'), which adds the NE, NW, SE, and SW actions, or customize the transition probabilities (GW.T) yourself.
Either way, check the resulting transitions around obstacles and grid edges, since those are the cells where blocked moves need special handling.
Here's how you can adjust your code:
%% Create Grid
% 'Kings' adds the four diagonal moves (NE, NW, SE, SW) to N, S, E, W
GW = createGridWorld(5,5,'Kings');
% Randomize starting state, avoiding obstacles and the terminal state
excludedIdx = [state2idx(GW, "[5,1]"); state2idx(GW, "[3,1]"); state2idx(GW, "[3,2]"); state2idx(GW, "[3,3]")];
validStartStates = setdiff(1:25, excludedIdx);
randomStartStateIdx = validStartStates(randi(length(validStartStates)));
GW.CurrentState = idx2state(GW, randomStartStateIdx);
GW.TerminalStates = '[5,1]';
GW.ObstacleStates = ["[3,1]";"[3,2]";"[3,3]"];
% Block moves into the obstacle cells
updateStateTranstionForObstacles(GW);
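Note that setting GW.CurrentState picks a single random start when the script runs. If a fresh random start is wanted at the beginning of every training episode, one option is the ResetFcn property of the environment returned by rlMDPEnv, which is expected to return the index of the initial state. The following is a minimal sketch under that assumption; the variable names excludedStates and excludedIdx are illustrative only.

```matlab
% Same grid as above; the move set does not affect the reset logic
GW = createGridWorld(5,5);
GW.TerminalStates = '[5,1]';
GW.ObstacleStates = ["[3,1]";"[3,2]";"[3,3]"];
updateStateTranstionForObstacles(GW);

% States the agent must never start in: the obstacles and the terminal state
excludedStates = ["[3,1]", "[3,2]", "[3,3]", "[5,1]"];
excludedIdx = arrayfun(@(s) state2idx(GW, s), excludedStates);
validStartStates = setdiff(1:numel(GW.States), excludedIdx);

% Wrap the grid world in an MDP environment and draw a new
% random start index each time the environment is reset
env = rlMDPEnv(GW);
env.ResetFcn = @() validStartStates(randi(numel(validStartStates)));
```

With this in place, each call to reset(env), including the ones made during training, should begin from a different admissible cell.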
Implementing full 8-directional movement with random starting points, especially around obstacles and grid edges, depends on the grid world's transition dynamics (GW.T). The MATLAB Reinforcement Learning Toolbox documentation on grid world environments gives additional guidance on manipulating the transition probabilities and state space.
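To make the transition dynamics concrete, here is a toolbox-free sketch that builds an 8-neighbor transition array for the 5x5 grid in the question. It assumes the T(s, sNext, a) layout used by GW.T, column-major state numbering (so "[r,c]" maps to sub2ind([5 5], r, c)), row 1 at the top of the grid, and that a blocked move leaves the agent where it is. The action ordering in offsets is an illustrative choice, not necessarily the toolbox's.

```matlab
nRows = 5; nCols = 5;
nStates = nRows * nCols;

% [row, col] offsets for the 8 neighbors: N, S, E, W, NE, NW, SE, SW
offsets = [-1 0; 1 0; 0 1; 0 -1; -1 1; -1 -1; 1 1; 1 -1];
nActions = size(offsets, 1);

% Obstacles from the question, as [row, col] pairs
obstacles = [3 1; 3 2; 3 3];
obstacleIdx = sub2ind([nRows nCols], obstacles(:,1), obstacles(:,2));

T = zeros(nStates, nStates, nActions);   % T(s, sNext, a)
for s = 1:nStates
    [r, c] = ind2sub([nRows nCols], s);
    for a = 1:nActions
        rNext = r + offsets(a,1);
        cNext = c + offsets(a,2);
        sNext = s;   % default: stay put when the move is blocked
        inGrid = rNext >= 1 && rNext <= nRows && cNext >= 1 && cNext <= nCols;
        if inGrid
            candidate = sub2ind([nRows nCols], rNext, cNext);
            if ~ismember(candidate, obstacleIdx)
                sNext = candidate;
            end
        end
        T(s, sNext, a) = 1;   % deterministic transition
    end
end

% States reachable from [1,1] (index 1) under any action
reachableFromCorner = find(any(T(1,:,:), 3));
```

Each slice T(s,:,a) sums to 1, so the array is a valid deterministic transition model. An array like this can only be assigned to GW.T when its third dimension matches the number of actions defined in the grid world.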
I hope this helps!

Release

R2020a
