Training and Simulation

Train and simulate reinforcement learning agents

During training, the agent continuously updates its parameters to learn the optimal policy for a given environment. During simulation, the agent receives observations and a reward from the environment, and returns an action to the environment without updating its parameters.

Reinforcement Learning Toolbox™ provides functions for training agents and validating the training results through simulation. For an introduction to training and simulating agents, see Train Reinforcement Learning Agents.
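
As a rough illustration of the two phases described above, the following MATLAB sketch trains a default DQN agent on the predefined cart-pole environment and then simulates it. The environment, the agent type, and every option value are assumptions chosen to keep the example short; they are not recommendations from this page. The default agent also requires Deep Learning Toolbox.

```matlab
% Minimal sketch: train an agent, then validate it through simulation.
% Assumed setup: predefined discrete cart-pole environment and a default
% DQN agent. Name=Value syntax requires R2021a or later.
rng(0);                                        % fix the seed for repeatability

env     = rlPredefinedEnv("CartPole-Discrete");
obsInfo = getObservationInfo(env);
actInfo = getActionInfo(env);
agent   = rlDQNAgent(obsInfo, actInfo);        % default networks and options

% Training: the agent's parameters are updated as episodes run.
trainOpts = rlTrainingOptions( ...
    MaxEpisodes=20, ...                        % illustrative value only
    MaxStepsPerEpisode=200, ...
    Verbose=false, ...
    Plots="none");
trainStats = train(agent, env, trainOpts);

% Simulation: the agent only returns actions; parameters are not updated.
simOpts     = rlSimulationOptions(MaxSteps=200);
experience  = sim(env, agent, simOpts);
totalReward = sum(experience.Reward);          % cumulative reward of the episode
```

Twenty episodes is far too few for the agent to learn a good policy. The point is only the call sequence: train modifies the agent, while sim leaves it unchanged and returns the logged experience.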

Apps

Reinforcement Learning Designer - Design, train, and simulate reinforcement learning agents (Since R2021a)

Functions

train - Train reinforcement learning agents within a specified environment
rlTrainingOptions - Options for training reinforcement learning agents
rlMultiAgentTrainingOptions - Options for training multiple reinforcement learning agents (Since R2022a)
trainWithEvolutionStrategy - Train DDPG, TD3 or SAC agent using an evolutionary strategy within a specified environment (Since R2023b)
rlEvolutionStrategyTrainingOptions - Options for training off-policy reinforcement learning agents using an evolutionary strategy (Since R2023b)
show - Visualize a training result object in a new Reinforcement Learning Training Monitor window (Since R2024a)
trainFromData - Train off-policy reinforcement learning agent using existing data (Since R2023a)
rlTrainingFromDataOptions - Options to train reinforcement learning agents using existing data (Since R2023a)
rlEvaluator - Options for evaluating reinforcement learning agents during training (Since R2023b)
rlCustomEvaluator - Custom object for evaluating reinforcement learning agents during training (Since R2023b)
rlDataLogger - Create either a file logger object or a monitor logger object to log training data (Since R2022b)
rlDataViewer - Open Reinforcement Learning Data Viewer tool (Since R2023a)
FileLogger - Log reinforcement learning training data to MAT-files (Since R2022b)
MonitorLogger - Log reinforcement learning training data to monitor window (Since R2022b)
trainingProgressMonitor - Monitor and plot training progress for deep learning custom training loops (Since R2022b)
setup - Set up reinforcement learning environment or initialize data logger object (Since R2022a)
store - Store data in the internal memory of a (file or monitor) logger object (Since R2022b)
write - Transfer stored data from the internal logger memory to the logging target (Since R2022b)
cleanup - Clean up reinforcement learning environment or data logger object (Since R2022a)
sim - Simulate trained reinforcement learning agents within specified environment
rlSimulationOptions - Options for simulating a reinforcement learning agent within an environment
rlReplayMemory - Replay memory experience buffer (Since R2022a)
rlPrioritizedReplayMemory - Replay memory experience buffer with prioritized sampling (Since R2022b)
rlHindsightReplayMemory - Hindsight replay memory experience buffer (Since R2023a)
rlHindsightPrioritizedReplayMemory - Hindsight replay memory experience buffer with prioritized sampling (Since R2023a)
append - Append experiences to replay memory buffer (Since R2022a)
sample - Sample experiences from replay memory buffer (Since R2022a)
resize - Resize replay memory experience buffer (Since R2022b)
allExperiences - Return all experiences in replay memory buffer (Since R2022b)
validateExperience - Validate experiences for replay memory (Since R2023a)
generateHindsightExperiences - Generate hindsight experiences from hindsight experience replay buffer (Since R2023a)
rlOptimizer - Create an optimizer object for actors and critics (Since R2022a)
runEpisode - Simulate reinforcement learning environment against policy or agent (Since R2022a)
syncParameters - Modify the learnable parameters of one approximator towards the learnable parameters of another approximator (Since R2022a)
update - Update the state of an optimizer object and a set of learnable parameters using the gradient value (Since R2022a)
evaluate - Evaluate function approximator object given observation (or observation-action) input data (Since R2022a)
Future - Object that supports deferred outputs for reinforcement learning environment simulations running on workers (Since R2022a)
fetchNext - Retrieve next available unread outputs from reinforcement learning environment simulations running on workers (Since R2022a)
fetchOutputs - Retrieve results from all reinforcement learning environment simulations running on workers (Since R2022a)
cancel - Cancel unfinished reinforcement learning environment simulations on workers (Since R2022a)
wait - Wait for reinforcement learning environment simulations running on workers to finish (Since R2022a)
dlfeval - Evaluate deep learning model for custom training loops
dlaccelerate - Accelerate deep learning function for custom training loops (Since R2021a)
AcceleratedFunction - Accelerated deep learning function (Since R2021a)

Blocks

RL Agent - Reinforcement learning agent
Policy - Reinforcement learning policy (Since R2022b)

Topics

Training and Simulation Basics

Use the Reinforcement Learning Designer App

Training and Simulation Advanced

Log Training Data and Tune Hyperparameters

Use Multiple Processes and GPUs

Multi-Agent Training

Develop Custom Agents and Training Algorithms

Train Model Based Policy Optimization Agents