Reinforcement learning: learning a game

Anne Tscheliessnig

27 Jul 2020

Hi,

I want to learn reinforcement learning (RL) by programming a game that requires two players (e.g. TicTacToe). To get the second player's action into the RL environment, I would adapt my step function as follows:

1) First get the initial observation and pass it on to the step function ("logged.signal")

Note: to add randomness, the ResetFunction would randomly pick whether Player 1 (= agent) or Player 2 starts. If Player 2 starts, Player 2 performs a first random action before the ResetFunction returns.

----Start of StepFunction

2) Take a random action and update the observation ("nextobs")

3) Check for "IsDone" and "Reward"

4) Take another random action (= Player 2) and update the observation ("logged.signal")

5) Check for "IsDone" and "Reward"

----- End of StepFunction
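
To make the plan concrete, here is a rough sketch of the reset and step logic in plain Python rather than MATLAB, so it does not depend on any toolbox API. Everything in it is an assumption made for illustration: the names `reset`, `step`, `winner` and `free_cells`, the board encoding (0 = empty, 1 = agent, 2 = Player 2), the reward values (+1 win, -1 loss, 0 otherwise), and the choice to end the episode with a penalty on an illegal move.

```python
import random

# Index triples that form a winning line on a 3x3 board stored as a flat list.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 1 or 2 if that player has three in a row, otherwise 0."""
    for a, b, c in WIN_LINES:
        if board[a] != 0 and board[a] == board[b] == board[c]:
            return board[a]
    return 0


def free_cells(board):
    """Return the indices of all empty cells."""
    return [i for i, v in enumerate(board) if v == 0]


def reset(rng):
    """Start with an empty board; half of the time Player 2 moves first."""
    board = [0] * 9
    if rng.random() < 0.5:
        board[rng.choice(free_cells(board))] = 2
    return board


def step(board, action, rng):
    """Apply the agent's move, then a random Player 2 move.

    Returns (next_board, reward, is_done).
    """
    board = list(board)                # do not modify the caller's board
    if board[action] != 0:
        return board, -1.0, True       # assumed rule: illegal move ends the episode
    board[action] = 1                  # step 2: agent's action
    if winner(board) == 1:             # step 3: check IsDone and Reward
        return board, 1.0, True
    if not free_cells(board):
        return board, 0.0, True        # draw
    board[rng.choice(free_cells(board))] = 2   # step 4: random Player 2 action
    if winner(board) == 2:             # step 5: check IsDone and Reward again
        return board, -1.0, True
    return board, 0.0, not free_cells(board)


# One episode with a purely random agent.
rng = random.Random(0)
board = reset(rng)
done = False
while not done:
    board, reward, done = step(board, rng.choice(free_cells(board)), rng)
```

The point of the sketch is that both checks (after the agent's move and after Player 2's move) happen inside one step call, so from the agent's side Player 2 is simply part of the environment.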

Does that sound feasible?