# rlDiscreteCategoricalActor

Stochastic categorical actor with a discrete action space for reinforcement learning agents

*Since R2022a*

## Description

This object implements a function approximator to be used as a stochastic actor
within a reinforcement learning agent with a discrete action space. A discrete categorical
actor takes an environment observation as input and returns as output a random action sampled
from a categorical (also known as Multinoulli) probability distribution, thereby implementing
a parametrized stochastic policy. After you create an `rlDiscreteCategoricalActor` object,
use it to create a suitable agent, such as `rlACAgent` or `rlPGAgent`. For more
information on creating actors and critics, see Create Policies and Value Functions.

## Creation

### Syntax

`actor = rlDiscreteCategoricalActor(net,observationInfo,actionInfo)`

`actor = rlDiscreteCategoricalActor(net,observationInfo,actionInfo,ObservationInputNames=netObsNames)`

`actor = rlDiscreteCategoricalActor({basisFcn,W0},observationInfo,actionInfo)`

`actor = rlDiscreteCategoricalActor(___,UseDevice=useDevice)`

### Description

`actor = rlDiscreteCategoricalActor(net,observationInfo,actionInfo)`
creates a stochastic actor with a discrete action space, using the deep neural network
`net` as underlying approximation model. For this actor, `actionInfo` must specify a
discrete action space. The network input layers are automatically associated with the
environment observation channels according to the dimension specifications in
`observationInfo`. The network must have a single output layer with as many elements
as the number of possible discrete actions, as specified in `actionInfo`. This function
sets the `ObservationInfo` and `ActionInfo` properties of `actor` to the inputs
`observationInfo` and `actionInfo`, respectively.
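
A minimal sketch of this syntax. The observation and action specifications and the
layer sizes below are illustrative assumptions, not requirements:

```matlab
% Illustrative environment specifications.
obsInfo = rlNumericSpec([4 1]);          % 4-element observation channel
actInfo = rlFiniteSetSpec([-1 0 1]);     % three possible discrete actions

% The single output layer must have one element per possible action.
layers = [
    featureInputLayer(prod(obsInfo.Dimension))
    fullyConnectedLayer(16)
    reluLayer
    fullyConnectedLayer(numel(actInfo.Elements))
    ];
net = dlnetwork(layers);

actor = rlDiscreteCategoricalActor(net,obsInfo,actInfo);
```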

`actor = rlDiscreteCategoricalActor(net,observationInfo,actionInfo,ObservationInputNames=netObsNames)`
specifies the names of the network input layers to be associated with the environment
observation channels. The function assigns, in sequential order, each environment
observation channel specified in `observationInfo` to the layer specified by the
corresponding name in the string array `netObsNames`. Therefore, the network input
layers, ordered as the names in `netObsNames`, must have the same data type and
dimensions as the observation channels, as ordered in `observationInfo`.
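
A sketch of this syntax for an environment with two observation channels; the layer
names `"obsA"` and `"obsB"` and all sizes are illustrative assumptions:

```matlab
obsInfo = [rlNumericSpec([4 1]) rlNumericSpec([2 1])];
actInfo = rlFiniteSetSpec([1 2]);

% Two input paths, one per observation channel, merged by concatenation.
pathA  = [featureInputLayer(4,Name="obsA") fullyConnectedLayer(8,Name="fcA")];
pathB  = [featureInputLayer(2,Name="obsB") fullyConnectedLayer(8,Name="fcB")];
common = [
    concatenationLayer(1,2,Name="cat")
    reluLayer
    fullyConnectedLayer(numel(actInfo.Elements))
    ];

lg = layerGraph(pathA);
lg = addLayers(lg,pathB);
lg = addLayers(lg,common);
lg = connectLayers(lg,"fcA","cat/in1");
lg = connectLayers(lg,"fcB","cat/in2");
net = dlnetwork(lg);

% Input layer names, ordered as the channels in obsInfo.
actor = rlDiscreteCategoricalActor(net,obsInfo,actInfo, ...
    ObservationInputNames=["obsA" "obsB"]);
```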

`actor = rlDiscreteCategoricalActor({basisFcn,W0},observationInfo,actionInfo)`
creates a discrete space stochastic actor using a custom basis function as underlying
approximation model. The first input argument is a two-element cell array whose first
element is the handle `basisFcn` to a custom basis function and whose second element is
the initial weight matrix `W0`. This function sets the `ObservationInfo` and
`ActionInfo` properties of `actor` to the inputs `observationInfo` and `actionInfo`,
respectively.
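
A sketch of this syntax, assuming a simple quadratic basis function. Here `W0` is
sized with one row per basis-function output and one column per possible action:

```matlab
obsInfo = rlNumericSpec([3 1]);
actInfo = rlFiniteSetSpec([1 2]);

% Illustrative basis: the observation and its elementwise square.
basisFcn = @(obs) [obs; obs.^2];                 % 6 basis outputs
W0 = rand(6,numel(actInfo.Elements));            % 6-by-2 initial weights

actor = rlDiscreteCategoricalActor({basisFcn,W0},obsInfo,actInfo);
```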

`actor = rlDiscreteCategoricalActor(___,UseDevice=useDevice)`
specifies the device used to perform computational operations on the `actor` object,
and sets the `UseDevice` property of `actor` to the `useDevice` input argument. You can
use this syntax with any of the previous input-argument combinations.
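
For example, a sketch reusing `net`, `obsInfo`, and `actInfo` from the first sketch
above:

```matlab
% "gpu" requires Parallel Computing Toolbox and a supported GPU.
actor = rlDiscreteCategoricalActor(net,obsInfo,actInfo,UseDevice="gpu");
```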

### Input Arguments

## Properties

## Object Functions

| Function | Description |
| --- | --- |
| `rlACAgent` | Actor-critic (AC) reinforcement learning agent |
| `rlPGAgent` | Policy gradient (PG) reinforcement learning agent |
| `rlPPOAgent` | Proximal policy optimization (PPO) reinforcement learning agent |
| `getAction` | Obtain action from agent, actor, or policy object given environment observations |
| `evaluate` | Evaluate function approximator object given observation (or observation-action) input data |
| `gradient` | Evaluate gradient of function approximator object given observation and action input data |
| `accelerate` | Option to accelerate computation of gradient for approximator object based on neural network |
| `getLearnableParameters` | Obtain learnable parameter values from agent, function approximator, or policy object |
| `setLearnableParameters` | Set learnable parameter values of agent, function approximator, or policy object |
| `setModel` | Set approximation model in function approximator object |
| `getModel` | Get approximation model from function approximator object |

## Examples
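
A minimal end-to-end sketch (all specifications and sizes below are illustrative
assumptions): create an actor from a network, sample an action with `getAction`, and
inspect the action probabilities with `evaluate`.

```matlab
obsInfo = rlNumericSpec([4 1]);
actInfo = rlFiniteSetSpec([-1 0 1]);

net = dlnetwork([
    featureInputLayer(prod(obsInfo.Dimension))
    fullyConnectedLayer(10)
    reluLayer
    fullyConnectedLayer(numel(actInfo.Elements))
    ]);

actor = rlDiscreteCategoricalActor(net,obsInfo,actInfo);

obs = {rand(obsInfo.Dimension)};

% Sample a random action from the categorical distribution.
act = getAction(actor,obs);

% Evaluate the probability of each possible action.
prob = evaluate(actor,obs);
prob{1}    % column vector of action probabilities
```

The actor can then be used to create a suitable agent, such as `rlACAgent` (which also
requires a critic).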

## Version History

**Introduced in R2022a**