Emmanouil Tzorakoleftherakis, MathWorks
Use Reinforcement Learning Toolbox™ and the DQN algorithm to perform image-based inversion of a simple pendulum. The workflow consists of the following steps: 1) create the environment, 2) specify the policy representation, 3) create the agent, 4) train the agent, and 5) verify the trained policy.
The provided pendulum environment has predefined observations, actions, and reward. The actions consist of five possible torque values, the observations consist of a 50x50 grayscale image and the angular rate of the pendulum, and the reward is based on the distance from the desired upright position. Learn how to use the Deep Network Designer app to construct a neural network representation of the Q-function, which the DQN agent uses to approximate the long-term reward.
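As a rough illustration of what the Q-function is used for, the sketch below is written in plain Python and does not use Reinforcement Learning Toolbox. It assumes the five torque values are -2, -1, 0, 1, and 2 (the description states only that there are five) and a discount factor of 0.99, and it stands in a hard-coded list of Q-values for the neural network. Under those assumptions, it shows the two things a DQN agent does with the Q-values: pick the action with the highest estimated long-term reward, and form the one-step training target r + gamma * max Q(s', a').

```python
# Illustrative only: this is not the Reinforcement Learning Toolbox API.
# ASSUMPTION: the torque values and discount factor are placeholders.
TORQUES = [-2.0, -1.0, 0.0, 1.0, 2.0]  # five discrete actions (hypothetical values)
GAMMA = 0.99                            # discount factor (hypothetical)


def greedy_action(q_values):
    """Return the index of the action with the largest Q-value."""
    return max(range(len(q_values)), key=lambda i: q_values[i])


def dqn_target(reward, next_q_values, done, gamma=GAMMA):
    """One-step DQN target: r + gamma * max_a' Q(s', a'), or r at episode end."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


# Stand-in for the network output for one observation
# (50x50 image plus angular rate): one Q-value per torque. Made-up numbers.
q_next = [-3.0, -1.5, -0.5, -2.0, -4.0]
best = greedy_action(q_next)
target = dqn_target(reward=-0.25, next_q_values=q_next, done=False)
print(TORQUES[best], target)
```

In the actual workflow, the list of Q-values would come from the network built in Deep Network Designer, with one output per torque value.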
See how you can visualize the pendulum behavior during training and monitor training progress. After training is complete, verify the policy in simulation to decide whether further training is necessary.
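One common way to monitor progress is to track a running average of the per-episode reward and compare it with a target value. The plain-Python sketch below illustrates that idea only; the window length, the threshold, and the reward numbers are all made up for the example, and the function names are hypothetical rather than part of any toolbox.

```python
# Illustrative only: the window, threshold, and rewards are hypothetical.
def moving_average(rewards, window):
    """Average of the last `window` episode rewards at each episode."""
    averages = []
    for i in range(len(rewards)):
        recent = rewards[max(0, i - window + 1): i + 1]
        averages.append(sum(recent) / len(recent))
    return averages


def needs_more_training(rewards, window, threshold):
    """True if the latest windowed average is still below the threshold."""
    return moving_average(rewards, window)[-1] < threshold


episode_rewards = [-900.0, -700.0, -650.0, -400.0, -300.0]  # made-up values
print(moving_average(episode_rewards, window=3))
print(needs_more_training(episode_rewards, window=3, threshold=-500.0))
```

A reward-based check like this complements, but does not replace, watching the trained policy in simulation.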