getMaxQValue
Obtain maximum estimated value over all possible actions from a Q-value function critic with discrete action space, given environment observations
Since R2020a
Syntax

[maxQ,maxActionIndex] = getMaxQValue(qValueFcnObj,obs)
[maxQ,maxActionIndex,nextState] = getMaxQValue(___)
___ = getMaxQValue(___,UseForward=useForward)

Description
[maxQ,maxActionIndex] = getMaxQValue(qValueFcnObj,obs) evaluates the discrete-action-space Q-value function critic qValueFcnObj and returns the maximum estimated value over all possible actions maxQ, with the corresponding action index maxActionIndex, given environment observations obs.
[maxQ,maxActionIndex,nextState] = getMaxQValue(___) also returns the updated state of qValueFcnObj when it contains a recurrent neural network.
___ = getMaxQValue(___,UseForward=useForward) allows you to explicitly call a forward pass when computing gradients.
Examples
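A minimal sketch of calling getMaxQValue. The observation and action specifications and the rlVectorQValueFunction critic below are illustrative assumptions, not taken from this page:

```matlab
% Define a 4-dimensional continuous observation space and a
% discrete action space with three possible actions.
obsInfo = rlNumericSpec([4 1]);
actInfo = rlFiniteSetSpec([-1 0 1]);

% Create a default vector Q-value critic for these specifications.
critic = rlVectorQValueFunction(obsInfo,actInfo);

% Query the maximum Q-value and the index of the maximizing action
% for a random observation (observations are passed in a cell array).
obs = {rand(4,1)};
[maxQ,maxActionIndex] = getMaxQValue(critic,obs);
```

Here maxActionIndex is an index into the set of possible actions defined by actInfo, not the action value itself.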
Input Arguments
Output Arguments
Tips
When the elements of the cell array in inData are dlarray objects, the elements of the cell array returned in outData are also dlarray objects. This allows getMaxQValue to be used with automatic differentiation.

Specifically, you can write a custom loss function that directly uses getMaxQValue and dlgradient within it, and then use dlfeval and dlaccelerate with your custom loss function. For an example, see Train Reinforcement Learning Policy Using Custom Training Loop and Custom Training Loop with Simulink Action Noise.
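A hedged sketch of the custom-loss pattern described above. The function name qLossFcn, the target argument, and the use of mse are illustrative assumptions; only getMaxQValue, UseForward, dlgradient, getLearnableParameters, and dlfeval come from the documented workflow:

```matlab
function [loss,gradients] = qLossFcn(critic,obs,target)
    % Explicit forward pass so gradients can be computed through
    % the critic (the purpose of the UseForward syntax).
    maxQ = getMaxQValue(critic,obs,UseForward=true);

    % Illustrative loss: squared error against a target Q-value.
    loss = mse(maxQ,target);

    % Differentiate the loss with respect to the critic's
    % learnable parameters; valid only inside a dlfeval call.
    gradients = dlgradient(loss,getLearnableParameters(critic));
end
```

You would then evaluate it with automatic differentiation enabled, for example `[loss,grads] = dlfeval(@qLossFcn,critic,obs,target)`, optionally wrapping the loss function with dlaccelerate.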
Version History
Introduced in R2020a