DQN Control for Inverted Pendulum with Reinforcement Learning Toolbox
Use the Deep Q-Network (DQN) algorithm in Reinforcement Learning Toolbox™ to 1) create the environment, 2) create a DQN agent, 3) customize the policy representation, 4) train the DQN agent, 5) verify the trained policy, and 6) deploy the trained policy with code generation.
The provided pendulum environment has predefined observations, actions, and reward. The actions are five possible torque values, the observations are a 50x50 grayscale image and the angular rate of the pendulum, and the reward is based on the distance from the desired upright position. See how the default DQN agent feature automatically constructs a neural network representation of the Q-function, which the DQN agent uses to approximate long-term reward. Learn how to use the Deep Network Designer app to graphically customize the generated Q-function representation.
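To make the role of the Q-function concrete, here is a minimal Python sketch of how a DQN-style agent could turn five Q-value estimates into a torque choice and form a one-step learning target. It is an illustration of the general technique only, not the Reinforcement Learning Toolbox API. The torque values, the Q-value numbers, the discount factor, and the function names `select_action` and `td_target` are all assumptions made for this example.

```python
import random

# Hypothetical torque set in N·m. The description only says there are
# five possible torque values; these particular numbers are assumed.
TORQUES = [-2.0, -1.0, 0.0, 1.0, 2.0]


def select_action(q_values, epsilon, rng=random):
    """Epsilon-greedy selection: return an index into TORQUES.

    With probability epsilon a random action is explored; otherwise the
    action with the highest estimated Q-value is chosen.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])


def td_target(reward, next_q_values, gamma, done):
    """One-step target: r + gamma * max over a' of Q(s', a').

    At the end of an episode there is no next state, so the target is
    just the immediate reward.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)


# Made-up Q-value estimates for one observation, one per torque.
q = [0.1, 0.4, 0.9, 0.3, -0.2]

greedy_index = select_action(q, epsilon=0.0)
print("chosen torque:", TORQUES[greedy_index])
print("learning target:", td_target(-1.0, q, gamma=0.99, done=False))
```

In an actual DQN, the list `q` would be the output of the neural network for the current image and angular rate, and the network weights would be adjusted so that the predicted Q-value of the action taken moves toward the target.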
See how you can visualize the pendulum behavior and logged data during training, and monitor training progress. After training is complete, verify the policy in simulation to decide if further training is necessary. If you are happy with the design, deploy the trained policy using automatic code generation.
Published: 20 Sep 2023