Training and Validation
To learn an optimal policy, a reinforcement learning agent interacts with the environment through a repeated trial-and-error process. During training, the agent tunes the parameters of its policy representation to maximize the long-term reward. Reinforcement Learning Toolbox™ software provides functions for training agents and validating the training results through simulation. For more information, see Train Reinforcement Learning Agents.
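The train-then-validate workflow described above can be illustrated with a small, toolbox-independent sketch. The following Python example is not Reinforcement Learning Toolbox code. The corridor environment, the function names `train_agent` and `simulate_agent`, and all hyperparameter values are assumptions made for illustration. It uses tabular Q-learning as one possible learning rule, then checks the learned policy by simulating it greedily, with no exploration and no learning.

```python
import random

N_STATES = 5          # hypothetical corridor: cells 0..4, cell 4 is the goal
ACTIONS = (-1, +1)    # action index 0 moves left, index 1 moves right


def step(state, action):
    """Hypothetical environment dynamics: returns (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else -0.1   # small penalty per move, bonus at the goal
    return next_state, reward, done


def greedy_action(q, state):
    """Index of the action with the largest Q-value (ties go to index 0)."""
    return max(range(len(ACTIONS)), key=lambda i: q[state][i])


def train_agent(episodes=300, alpha=0.5, gamma=0.9, epsilon=0.2,
                max_steps=50, seed=0):
    """Training: trial-and-error interaction that tunes the Q-table."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            # Epsilon-greedy exploration: occasionally try a random action.
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = greedy_action(q, state)
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning update toward the one-step bootstrapped target.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
            if done:
                break
    return q


def simulate_agent(q, max_steps=50):
    """Validation: run the greedy policy without exploration or learning."""
    state, total_reward, steps = 0, 0.0, 0
    for _ in range(max_steps):
        state, reward, done = step(state, ACTIONS[greedy_action(q, state)])
        total_reward += reward
        steps += 1
        if done:
            break
    return total_reward, steps


if __name__ == "__main__":
    q_table = train_agent()
    total_reward, steps = simulate_agent(q_table)
    print(f"validation: reward={total_reward:.2f}, steps={steps}")
```

Keeping training and simulation as separate functions mirrors the split between training an agent and validating the result: exploration and parameter updates happen only during training, so the simulated return reflects the learned policy alone. With the settings shown, the greedy rollout should walk straight to the goal in four steps.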
|Reinforcement Learning Designer||Design, train, and simulate reinforcement learning agents|
|Train reinforcement learning agents within a specified environment|
|Options for training reinforcement learning agents|
|Options for training multiple reinforcement learning agents|
|Plot training information from a previous training session|
|RL Agent||Reinforcement learning agent|
Training and Simulation Basics
- Train Reinforcement Learning Agents
Find the optimal policy by training your agent within a specified environment.
- Train Reinforcement Learning Agent in Basic Grid World
Train Q-learning and SARSA agents to solve a grid world in MATLAB®.
- Train Reinforcement Learning Agent in MDP Environment
Train a reinforcement learning agent in a generic Markov decision process environment.
- Create Simulink Environment and Train Agent
Train a controller using reinforcement learning with a plant modeled in Simulink® as the training environment.
Using the Reinforcement Learning Designer App
- Design and Train Agent Using Reinforcement Learning Designer
Design and train a DQN agent for a cart-pole system using the Reinforcement Learning Designer app.
- Specify Simulation Options in Reinforcement Learning Designer
Interactively specify options for simulating reinforcement learning agents.
- Specify Training Options in Reinforcement Learning Designer
Interactively specify options for training reinforcement learning agents.
Using Multiple Processes and GPUs
- Train Agents Using Parallel Computing and GPUs
Accelerate agent training by running simulations in parallel on multiple cores, GPUs, clusters, or cloud resources.
- Train AC Agent to Balance Cart-Pole System Using Parallel Computing
Train an actor-critic agent using asynchronous parallel computing.
- Train DQN Agent for Lane Keeping Assist Using Parallel Computing
Train a reinforcement learning agent for an automated driving application using parallel computing.
Train Agents in MATLAB Environments
- Train DDPG Agent to Control Double Integrator System
Train a deep deterministic policy gradient agent to control a second-order dynamic system modeled in MATLAB.
- Train PG Agent with Baseline to Control Double Integrator System
Train a policy gradient agent with a baseline to control a double integrator system modeled in MATLAB.
- Train DQN Agent to Balance Cart-Pole System
Train a deep Q-learning network agent to balance a cart-pole system modeled in MATLAB.
- Train PG Agent to Balance Cart-Pole System
Train a policy gradient agent to balance a cart-pole system modeled in MATLAB.
- Train AC Agent to Balance Cart-Pole System
Train an actor-critic agent to balance a cart-pole system modeled in MATLAB.
- Train DDPG Agent to Swing Up and Balance Pendulum with Image Observation
Train a reinforcement learning agent using an image-based observation signal.
- Create Agent Using Deep Network Designer and Train Using Image Observations
Create a reinforcement learning agent using the Deep Network Designer app from the Deep Learning Toolbox™.
Train Agents in Simulink Environments
- Train DQN Agent to Swing Up and Balance Pendulum
Train a deep Q-network agent to balance a pendulum modeled in Simulink.
- Train DDPG Agent to Swing Up and Balance Pendulum
Train a deep deterministic policy gradient agent to balance a pendulum modeled in Simulink.
- Train DDPG Agent to Swing Up and Balance Pendulum with Bus Signal
Train a reinforcement learning agent to balance a pendulum in a Simulink model whose observations are contained in a bus signal.
- Train DDPG Agent to Swing Up and Balance Cart-Pole System
Train a deep deterministic policy gradient agent to swing up and balance a cart-pole system modeled in Simscape™ Multibody™.
- Train Multiple Agents to Perform Collaborative Task
Train two PPO agents to collaboratively move an object.
- Train Multiple Agents for Area Coverage
Train three PPO agents to explore a grid-world environment in a collaborative-competitive manner.
- Train Multiple Agents for Path Following Control
Train a DQN agent and a DDPG agent to collaboratively perform adaptive cruise control and lane keeping assist to follow a path.
Generate Rewards from Control Specifications
- Generate Reward Function from a Model Predictive Controller for a Servomotor
Generate a reward function from an MPC controller applied to a servomotor.
- Generate Reward Function from a Model Verification Block for a Water Tank System
Generate a reward function from a model verification block applied to a water tank system.
Imitation Learning
- Imitate MPC Controller for Lane Keeping Assist
Train a deep neural network to imitate the behavior of a model predictive controller within a lane keeping assist system.
- Imitate Nonlinear MPC Controller for Flying Robot
Train a deep neural network to imitate the behavior of a nonlinear model predictive controller for a flying robot.
- Train DDPG Agent with Pretrained Actor Network
Train a reinforcement learning agent using an actor network that has been previously trained using supervised learning.
Custom Agents and Training Algorithms
- Train Custom LQR Agent
Train a custom LQR agent.
- Train Reinforcement Learning Policy Using Custom Training Loop
Train a reinforcement learning policy using your own custom training algorithm.
- Custom Training Loop with Simulink Action Noise
Use a custom training loop to train a reinforcement learning policy in Simulink when action noise is generated within the model.
- Create Agent for Custom Reinforcement Learning Algorithm
Create an agent for a custom reinforcement learning algorithm.