Can I use rlfinitesetspec for a multi-observation system where the number of combinations is arbitrarily large?

Question

Alvin Allen 2021-11-2


Direct link to this question:

https://ww2.mathworks.cn/matlabcentral/answers/1577705-can-i-use-rlfinitesetspec-for-a-multi-observation-system-where-the-number-of-combinations-is-arbitra

I am trying to solve a reinforcement learning problem. My environment is a 4x4 game board (16 squares), where each square can take one of four states: 0, 1, 2, or 3. The board position is used as the observation, so an example board might be

0 2 0 1

1 3 1 2

0 1 2 3

3 1 2 0

which would then be converted into the observation [0 2 0 1 1 3 1 2 0 1 2 3 3 1 2 0]. I can't tell from the documentation of rlNumericSpec and rlFiniteSetSpec whether this kind of system should be treated as a discrete observation space or whether I am required to use a continuous one. The continuous implementation is intuitive to me: I can use rlNumericSpec([16 1]). However, I would expect that, when possible, it is better to use a discrete specification, since it fixes the number of states that need to be considered, removes the need for interpolation, etc.
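
For concreteness, here is a minimal sketch of the continuous setup described above, assuming Reinforcement Learning Toolbox is installed. The limits and the name 'board' are illustrative assumptions, not something the problem requires.

```matlab
% Example board from above (4x4, entries in {0,1,2,3}).
board = [0 2 0 1;
         1 3 1 2;
         0 1 2 3;
         3 1 2 0];

% Flatten row by row into a 16x1 observation. MATLAB stores arrays in
% column-major order, so transpose before reshaping to keep row order.
obs = reshape(board.', [], 1);

% Continuous (numeric) observation spec. The dimension is passed as a vector.
% Assumption: the limits simply record the known range of each square.
obsInfo = rlNumericSpec([16 1], 'LowerLimit', 0, 'UpperLimit', 3);
obsInfo.Name = 'board';   % hypothetical label
```

Note that a numeric spec with limits only bounds the range; it does not by itself restrict the entries to integers.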

In my case, because the possible states of each grid position are known and fixed, it would be possible to write out an array of all possible board positions, but if my combinatorics is right there are 4^16 = 4,294,967,296 possible board states, which is... a lot. From my reading of the documentation of rlFiniteSetSpec, in order to use a discrete specification I would need to write out a cell array containing each possible board state. If that's correct, what would be the easiest way to generate that array? I'm aware of perms(), but I don't think it's applicable here. If that's not correct, I can't tell from the examples in the documentation whether this kind of observation space can (or indeed should) be expressed using rlFiniteSetSpec, and that's what I'd like to confirm with this question.
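
To make the scale concrete, here is a hedged sketch of what such an enumeration could look like. perms() reorders a fixed vector, whereas the board states are every length-16 sequence over four symbols, which is the same as counting in base 4. The sketch therefore uses dec2base on a toy 2x2 board. The toy size and all variable names are assumptions for illustration, and the final call assumes rlFiniteSetSpec accepts a cell array of equally sized numeric vectors, as in its documented examples.

```matlab
nValues  = 4;    % states per square: 0, 1, 2, 3
nSquares = 16;   % 4x4 board

% Size of the full observation set: 4^16 = 2^32 = 4294967296.
nStates = nValues^nSquares;

% Even at one byte per square, listing every state would take
% nStates * nSquares = 2^36 bytes = 64 GiB, so enumerate a toy board instead.
nToy  = 4;                                  % squares on a 2x2 toy board
codes = 0:(nValues^nToy - 1);               % 0..255, one code per board

% Base-4 digits of each code, zero-padded to nToy digits, as numbers.
toyStates = dec2base(codes, nValues, nToy) - '0';   % 256x4, one board per row

% Hypothetical use: one cell per state, each a 1x4 row vector.
elements   = num2cell(toyStates, 2).';      % 1x256 cell array
toyObsInfo = rlFiniteSetSpec(elements);
```

The same base-4 counting would in principle cover the 4x4 board, but the resulting list is far too large to build in memory on typical hardware.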

Thanks in advance


Answers (0)
