# rlVectorQValueFunction

Vector Q-value function approximator for reinforcement learning agents

*Since R2022a*

## Description

This object implements a vector Q-value function approximator that you can use as
a critic with a discrete action space for a reinforcement learning agent. A vector Q-value
function (also known as a vector action-value function) is a mapping from an environment
observation to a vector in which each element represents the expected discounted cumulative
long-term reward when the agent starts from the state corresponding to the given observation
and executes the action corresponding to the element number (and follows a given policy
afterwards). A vector Q-value function critic therefore needs only the observation as input.
After you create an `rlVectorQValueFunction` critic, use it to create an agent
such as `rlQAgent`, `rlDQNAgent`, or `rlSARSAAgent`. For more
information on creating actors and critics, see Create Policies and Value Functions.

## Creation

### Syntax

### Description

`critic` = rlVectorQValueFunction(`net`,`observationInfo`,`actionInfo`) creates the *multi-output* Q-value function
`critic` with a *discrete action space*. Here,
`net` is the deep neural network used as an approximation model,
and must have only the observations as input and a single output layer having as many
elements as the number of possible discrete actions. The network input layers are
automatically associated with the environment observation channels according to the
dimension specifications in `observationInfo`. This function sets the
`ObservationInfo` and `ActionInfo` properties of
`critic` to the `observationInfo` and
`actionInfo` input arguments, respectively.

`critic` = rlVectorQValueFunction({`basisFcn`,`W0`},`observationInfo`,`actionInfo`) creates the *multi-output* Q-value function
`critic` with a *discrete action space* using a
custom basis function as the underlying approximation model. The first input argument is a
two-element cell array whose first element is the handle `basisFcn`
to a custom basis function and whose second element is the initial weight matrix
`W0`. Here, the basis function must have only the observations as
inputs, and `W0` must have as many columns as the number of possible
actions. The function sets the `ObservationInfo`
and `ActionInfo` properties of `critic`
to the input arguments
`observationInfo` and `actionInfo`,
respectively.
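A short sketch of the basis-function syntax (the feature choice and dimensions below are illustrative): the basis function maps the observation to a feature vector, and `W0` supplies one column of weights per possible action.

```matlab
% Hypothetical environment: 2-dimensional observation, 3 possible actions.
obsInfo = rlNumericSpec([2 1]);
actInfo = rlFiniteSetSpec([1 2 3]);

% Basis function taking only the observation as input and returning
% a 5-element feature vector (linear, quadratic, and bias terms).
basisFcn = @(obs) [obs; obs.^2; 1];

% Initial weights: 5 features x 3 columns (one column per action).
W0 = rand(5,3);

critic = rlVectorQValueFunction({basisFcn,W0},obsInfo,actInfo);
```

The number of columns of `W0` fixes the length of the critic's output vector, so it must match the number of elements in the discrete action space.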

`critic` = rlVectorQValueFunction(___,`Name=Value`) specifies names of the observation input layers (for network-based approximators) or
sets the `UseDevice` property using one or more name-value arguments.
Specifying the input layer names allows you to explicitly associate the layers of your
network approximator with specific environment channels. For all types of approximators,
you can specify the device where computations for `critic` are
executed, for example `UseDevice="gpu"`.
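A hedged sketch of the name-value syntax, assuming a network `net` whose input layer is named `"obsIn"` and that `obsInfo` and `actInfo` were created as in the earlier examples (the layer name and the `ObservationInputNames` pairing shown here are illustrative):

```matlab
% Explicitly associate the named network input layer with the
% observation channel, and run critic computations on a GPU.
critic = rlVectorQValueFunction(net,obsInfo,actInfo, ...
    ObservationInputNames="obsIn", ...
    UseDevice="gpu");
```

Explicit layer names are mainly useful when the environment has multiple observation channels and the automatic association by dimension would be ambiguous.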

### Input Arguments

## Properties

## Object Functions

| Function | Description |
| --- | --- |
| `rlDQNAgent` | Deep Q-network (DQN) reinforcement learning agent |
| `rlQAgent` | Q-learning reinforcement learning agent |
| `rlSARSAAgent` | SARSA reinforcement learning agent |
| `getValue` | Obtain estimated value from a critic given environment observations and actions |
| `getMaxQValue` | Obtain maximum estimated value over all possible actions from a Q-value function critic with discrete action space, given environment observations |
| `evaluate` | Evaluate function approximator object given observation (or observation-action) input data |
| `getLearnableParameters` | Obtain learnable parameter values from agent, function approximator, or policy object |
| `setLearnableParameters` | Set learnable parameter values of agent, function approximator, or policy object |
| `setModel` | Set approximation model in function approximator object |
| `getModel` | Get approximation model from function approximator object |
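As a brief sketch of querying a critic with these functions (assuming `critic` was created for a 4-dimensional observation as in the network example above):

```matlab
obs = {rand(4,1)};

% Vector of Q-values, one element per possible action.
qVec = getValue(critic,obs);

% Maximum Q-value over all actions, and the index of the
% corresponding action in the discrete action set.
[maxQ,maxActionIdx] = getMaxQValue(critic,obs);
```

`getMaxQValue` is what a greedy discrete-action policy effectively computes at each step.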

## Examples

## Version History

**Introduced in R2022a**