Specify Training Options in Reinforcement Learning Designer
To configure the training of an agent in the Reinforcement Learning Designer app, specify training options on the Train tab.
Specify Basic Options
On the Train tab, you can specify the following basic training options.
Option | Description |
---|---|
Max Episodes | Maximum number of episodes to train the agent, specified as a positive integer. |
Max Episode Length | Maximum number of steps to run per episode, specified as a positive integer. |
Stopping Criteria | Training termination condition, specified as one of the available criteria (for example, an average reward or episode count threshold). |
Stopping Value | Critical value of the training termination condition in Stopping Criteria, specified as a scalar. |
Average Window Length | Window length for averaging the scores, rewards, and number of steps for the agent when either Stopping Criteria or Save agent criteria specifies an averaging condition. |
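These basic options correspond to properties of the rlTrainingOptions object used for command-line training. The following minimal sketch shows an equivalent command-line configuration; the values are illustrative only.

```matlab
% Illustrative values only; choose values appropriate for your task.
trainOpts = rlTrainingOptions( ...
    MaxEpisodes=500, ...                       % Max Episodes
    MaxStepsPerEpisode=200, ...                % Max Episode Length
    StopTrainingCriteria="AverageReward", ...  % Stopping Criteria
    StopTrainingValue=480, ...                 % Stopping Value
    ScoreAveragingWindowLength=10);            % Average Window Length
```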
Specify Agent Evaluation Options
To enable agent evaluation at regular intervals during training, on the Train tab, click Evaluate Agent.
To specify agent evaluation options, select Evaluate Agent > Agent evaluation options.
In the Agent Evaluation Options dialog box, you can specify the following evaluation options.
Option | Description |
---|---|
Enable agent evaluation | Enables periodic agent evaluation during training. This option is also selected automatically when you click Evaluate Agent on the Train tab. |
Number of evaluation episodes | Number of consecutive evaluation episodes, specified as a positive integer. After the number of consecutive training episodes specified in Evaluation frequency, the software runs this many evaluation episodes consecutively. For example, if you specify 5 evaluation episodes and an evaluation frequency of 25, the software runs 5 consecutive evaluation episodes after every 25 training episodes. |
Evaluation frequency | Evaluation period, specified as a positive integer. This is the number of consecutive training episodes after which the software runs the number of consecutive evaluation episodes specified in the Number of evaluation episodes field. For example, if you specify an evaluation frequency of 25, the agent is evaluated after every 25 training episodes. |
Max evaluation episode length | Maximum number of steps to run per evaluation episode if no other termination condition is met first, specified as a positive integer. To accurately assess the stability and performance of the agent, it is often useful to allow more steps for an evaluation episode than for a training episode. If you leave this field empty (default), the value specified in the Max Episode Length field is used. |
Evaluation random seeds | Random seeds used for the evaluation episodes. The random seed used for training is stored before the first episode of an evaluation sequence and restored as the current seed after the evaluation sequence. This ensures that the training results with evaluation are the same as the results without evaluation. |
Evaluation statistic type | Type of statistic computed over each group of consecutive evaluation episodes, such as the mean or median episode reward. The computed statistic is returned in the training result object. |
Use exploration policy | Option to use the exploration policy during evaluation episodes. When this option is disabled (default), the agent selects actions using its base greedy policy during evaluation episodes. When you enable this option, the agent uses its exploration policy instead. |
For more information on evaluation options, see rlEvaluator.
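At the command line, the same periodic evaluation can be configured with an rlEvaluator object passed to the train function. The sketch below assumes the NumEpisodes, EvaluationFrequency, and UseExplorationPolicy property names and the Evaluator argument of train; confirm the exact names against the rlEvaluator documentation for your release.

```matlab
% Minimal sketch; property and argument names are assumptions based on
% the app options above -- verify against the rlEvaluator reference page.
evaluator = rlEvaluator( ...
    NumEpisodes=5, ...             % Number of evaluation episodes
    EvaluationFrequency=25, ...    % Evaluation frequency
    UseExplorationPolicy=false);   % Use exploration policy (off: greedy policy)

% Pass the evaluator to train together with the training options.
results = train(agent, env, trainOpts, Evaluator=evaluator);
```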
Specify Parallel Training Options
To enable the use of multiple processes for training, on the Train tab, click Use Parallel. Training agents using parallel computing requires Parallel Computing Toolbox™ software. For more information, see Train Agents Using Parallel Computing and GPUs.
To specify options for parallel training, select Use Parallel > Parallel training options.
In the Parallel Training Options dialog box, you can specify the following training options.
Option | Description |
---|---|
Enable parallel training | Enables the use of multiple processes to perform environment simulations during training. This option is also selected automatically when you click Use Parallel on the Train tab. |
Parallel computing mode | Parallel computing mode, which determines whether the workers exchange data with the client synchronously or asynchronously. |
Transfer workspace variables to workers | Select this option to send model and workspace variables to parallel workers. When you select this option, the parallel pool client (the process that starts the training) sends variables used in models and defined in the MATLAB® workspace to the workers. |
Random seed for workers | Random number generator initialization for the parallel workers. |
Files to attach to parallel pool | Additional files to attach to the parallel pool. Specify names of files in the current working directory, with one name on each line. |
Worker setup function | Function to run before training starts, specified as a handle to a function having no input arguments. This function is run once per worker before training begins. Write this function to perform any processing that you need prior to training. |
Worker cleanup function | Function to run after training ends, specified as a handle to a function having no input arguments. You can write this function to clean up the workspace or perform other processing after training terminates. |
The following figure shows an example parallel training configuration for the following files and functions.
- Data file attached to the parallel pool — workerData.mat
- Worker setup function — mySetup.m
- Worker cleanup function — myCleanup.m
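The worker setup and cleanup functions are ordinary MATLAB functions with no input arguments, each saved in its own file. A minimal sketch of what mySetup.m and myCleanup.m might contain is shown below; the function bodies are purely illustrative.

```matlab
% mySetup.m -- runs once on each worker before training starts.
function mySetup()
% Illustrative body: read the data file attached to the parallel pool.
data = load("workerData.mat");
fprintf("Worker setup complete (%d variables loaded).\n", ...
    numel(fieldnames(data)));
end
```

```matlab
% myCleanup.m -- runs once on each worker after training ends.
function myCleanup()
% Illustrative body: report that the worker finished cleanly.
fprintf("Worker cleanup complete.\n");
end
```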
For more information on parallel training options, see the UseParallel and ParallelizationOptions properties in rlTrainingOptions.
For more information on parallel training, see Train Agents Using Parallel Computing and GPUs.
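For reference, a command-line configuration of the same parallel options might look like the following sketch. The ParallelizationOptions sub-option names shown here are assumptions; verify them on the rlTrainingOptions reference page for your release.

```matlab
% Sketch only; sub-option names are assumptions -- confirm against the
% rlTrainingOptions documentation for your release.
trainOpts = rlTrainingOptions(UseParallel=true);
trainOpts.ParallelizationOptions.TransferBaseWorkspaceVariables = "on";
trainOpts.ParallelizationOptions.AttachedFiles = "workerData.mat";
trainOpts.ParallelizationOptions.SetupFcn   = @mySetup;
trainOpts.ParallelizationOptions.CleanupFcn = @myCleanup;
```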
Specify Additional Options
To specify additional training options, on the Train tab, click More Options.
In the More Training Options dialog box, you can specify the following options.
Option | Description |
---|---|
Save agent criteria | Condition for saving agents during training, specified as one of the available criteria (for example, an episode reward or average reward threshold). |
Save agent value | Critical value of the save agent condition in Save agent criteria, specified as a scalar or "none". |
Save directory | Folder for saved agents. If you specify a name and the folder does not exist, the app creates the folder in the current working directory. To interactively select a folder, click Browse. |
Show verbose output | Select this option to display training progress at the command line. |
Stop on Error | Select this option to stop training when an error occurs during an episode. |
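These additional options correspond to properties of the rlTrainingOptions object. The following sketch shows illustrative command-line equivalents; the values are examples only.

```matlab
% Illustrative values only; adjust the criteria and folder for your task.
trainOpts.SaveAgentCriteria  = "EpisodeReward";  % Save agent criteria
trainOpts.SaveAgentValue     = 500;              % Save agent value
trainOpts.SaveAgentDirectory = "savedAgents";    % Save directory
trainOpts.Verbose            = true;             % Show verbose output
trainOpts.StopOnError        = "on";             % Stop on Error
```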
For more information on training options, see rlTrainingOptions.
See Also
Related Examples
- Design and Train Agent Using Reinforcement Learning Designer
- Specify Simulation Options in Reinforcement Learning Designer