rlContinuousGaussianActor
Stochastic Gaussian actor with a continuous action space for reinforcement learning agents
Since R2022a
Description
This object implements a function approximator to be used as a stochastic actor within a reinforcement learning agent with a continuous action space. A continuous Gaussian actor takes an environment observation as input and returns as output a random action sampled from a parametrized Gaussian probability distribution, thereby implementing a parametrized stochastic policy. After you create an rlContinuousGaussianActor object, use it to create a suitable agent, such as an rlACAgent or rlPGAgent agent. For more information on creating actors and critics, see Create Policies and Value Functions.
Creation
Syntax
actor = rlContinuousGaussianActor(net,observationInfo,actionInfo,ActionMeanOutputNames=meanOutLyrName,ActionStandardDeviationOutputNames=stdOutLyrName)
actor = rlContinuousGaussianActor(___,Name=Value)
Description
actor = rlContinuousGaussianActor(net,observationInfo,actionInfo,ActionMeanOutputNames=meanOutLyrName,ActionStandardDeviationOutputNames=stdOutLyrName) creates a Gaussian stochastic actor with a continuous action space, using the deep neural network net as the approximation model. Here, net must have two differently named output layers, each with as many elements as the number of dimensions of the action space, as specified in actionInfo. The two output layers must return the mean and the standard deviation of each component of the action, respectively. The actor uses the outputs of these two layers, according to the names specified in the strings meanOutLyrName and stdOutLyrName, to represent the Gaussian probability distribution from which the action is sampled. This syntax sets the ObservationInfo and ActionInfo properties of actor to the input arguments observationInfo and actionInfo, respectively.
Note
actor does not enforce constraints set by the action specification. When you use this actor in an agent other than a SAC agent, you must enforce action space constraints within the environment.
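As a minimal sketch of this syntax, assuming a hypothetical 4-dimensional observation space, a 2-dimensional action space, and illustrative layer names (obsIn, comOut, meanOut, stdFC, stdOut), you might build a suitable two-headed network and create the actor as follows.

% Example observation and action specifications (hypothetical dimensions).
obsInfo = rlNumericSpec([4 1]);
actInfo = rlNumericSpec([2 1]);

% Common path taking the observation as input.
commonPath = [
    featureInputLayer(prod(obsInfo.Dimension),Name="obsIn")
    fullyConnectedLayer(64)
    reluLayer(Name="comOut")];

% Mean head: one output element per action dimension.
meanPath = fullyConnectedLayer(prod(actInfo.Dimension),Name="meanOut");

% Standard deviation head: softplus keeps the standard deviations positive.
stdPath = [
    fullyConnectedLayer(prod(actInfo.Dimension),Name="stdFC")
    softplusLayer(Name="stdOut")];

% Assemble the branched network.
net = layerGraph(commonPath);
net = addLayers(net,meanPath);
net = addLayers(net,stdPath);
net = connectLayers(net,"comOut","meanOut");
net = connectLayers(net,"comOut","stdFC");
net = dlnetwork(net);

% Create the actor, naming the mean and standard deviation output layers.
actor = rlContinuousGaussianActor(net,obsInfo,actInfo, ...
    ActionMeanOutputNames="meanOut", ...
    ActionStandardDeviationOutputNames="stdOut");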
actor = rlContinuousGaussianActor(___,Name=Value) specifies the names of the observation input layers or sets the UseDevice property using one or more name-value arguments. Use this syntax with any of the input argument combinations in the preceding syntax. Specify the input layer names to explicitly associate the layers of your network with specific environment channels. To specify the device where computations for actor are executed, set the UseDevice property, for example UseDevice="gpu".
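As a minimal sketch of this syntax, reusing net, obsInfo, actInfo, and the hypothetical layer names from the preceding sketch, you might also name the observation input layer and run computations on a GPU.

% Explicitly associate the input layer "obsIn" with the observation channel
% and set the UseDevice property at creation time.
actor = rlContinuousGaussianActor(net,obsInfo,actInfo, ...
    ObservationInputNames="obsIn", ...
    ActionMeanOutputNames="meanOut", ...
    ActionStandardDeviationOutputNames="stdOut", ...
    UseDevice="gpu");   % requires a supported GPU and Parallel Computing Toolbox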
Input Arguments
Properties
Object Functions
rlACAgent | Actor-critic (AC) reinforcement learning agent
rlPGAgent | Policy gradient (PG) reinforcement learning agent
rlPPOAgent | Proximal policy optimization (PPO) reinforcement learning agent
rlSACAgent | Soft actor-critic (SAC) reinforcement learning agent
getAction | Obtain action from agent, actor, or policy object given environment observations
evaluate | Evaluate function approximator object given observation (or observation-action) input data
gradient | (Not recommended) Evaluate gradient of function approximator object given observation and action input data
accelerate | (Not recommended) Option to accelerate computation of gradient for approximator object based on neural network
getLearnableParameters | Obtain learnable parameter values from agent, function approximator, or policy object
setLearnableParameters | Set learnable parameter values of agent, function approximator, or policy object
setModel | Set approximation model in function approximator object
getModel | Get approximation model from function approximator object
Examples
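As a minimal sketch, assuming the actor and obsInfo defined in the sketches above, you can sample an action for a random observation with getAction. Because the policy is stochastic, repeated calls with the same observation generally return different actions.

% Sample a random action for a random observation.
obs = rand(obsInfo.Dimension);
act = getAction(actor,{obs});
act{1}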
Version History
Introduced in R2022a