getValue
Obtain estimated value from a critic given environment observations and actions
Syntax
value = getValue(valueFcnAppx,obs)
value = getValue(vqValueFcnAppx,obs)
value = getValue(qValueFcnAppx,obs,act)
[value,state] = getValue(___)
___ = getValue(___,UseForward=useForward)
Description
Value Function Critic
value = getValue(valueFcnAppx,obs) evaluates the value function critic valueFcnAppx and returns the value corresponding to the observation obs. In this case, valueFcnAppx is an rlValueFunction approximator object.
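As a hedged sketch (the observation dimensions here are illustrative, and the critic is built with a default network from its specification):

```matlab
% Define an observation specification (illustrative 4-by-1 continuous channel).
obsInfo = rlNumericSpec([4 1]);

% Create a value-function critic with a default network from the spec.
critic = rlValueFunction(obsInfo);

% Evaluate the critic for a random observation. Observations are passed
% as a cell array with one cell per observation channel; the result is
% the scalar estimated value of that observation.
value = getValue(critic,{rand(4,1)})
```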
Q-Value Function Critics
value = getValue(vqValueFcnAppx,obs) evaluates the discrete-action-space Q-value function critic vqValueFcnAppx and returns the vector value, in which each element represents the estimated value given the state corresponding to the observation obs and the action corresponding to the element number of value. In this case, vqValueFcnAppx is an rlVectorQValueFunction approximator object.
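A minimal sketch, assuming illustrative specification dimensions and a default critic network:

```matlab
% Continuous observation channel and a discrete action set of three actions.
obsInfo = rlNumericSpec([4 1]);
actInfo = rlFiniteSetSpec([-1 0 1]);

% Create a vector Q-value function critic with a default network.
critic = rlVectorQValueFunction(obsInfo,actInfo);

% getValue returns one estimated Q-value per possible action
% (here, a vector with three elements).
value = getValue(critic,{rand(4,1)})
```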
value = getValue(qValueFcnAppx,obs,act) evaluates the Q-value function critic qValueFcnAppx and returns the scalar value, representing the value given the observation obs and action act. In this case, qValueFcnAppx is an rlQValueFunction approximator object.
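A minimal sketch, again with illustrative specification dimensions and a default critic network:

```matlab
% Continuous observation and continuous action channels (illustrative sizes).
obsInfo = rlNumericSpec([4 1]);
actInfo = rlNumericSpec([2 1]);

% Create a single-output Q-value function critic with a default network.
critic = rlQValueFunction(obsInfo,actInfo);

% Pass both the observation and the action; the result is the scalar
% estimated value of that observation-action pair.
value = getValue(critic,{rand(4,1)},{rand(2,1)})
```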
Return Recurrent Neural Network State
[value,state] = getValue(___) also returns the updated state of the critic object when it contains a recurrent neural network.
Use Forward
___ = getValue(___,UseForward=useForward) allows you to explicitly call a forward pass when computing gradients.
Examples
Input Arguments
Output Arguments
Tips
The more general function evaluate behaves, for critic objects, similarly to getValue except that evaluate returns results inside a single-cell array.

When the elements of the cell array in inData are dlarray objects, the elements of the cell array returned in outData are also dlarray objects. This allows getValue to be used with automatic differentiation.

Specifically, you can write a custom loss function that directly uses getValue and dlgradient within it, and then use dlfeval and dlaccelerate with your custom loss function. For an example, see Train Reinforcement Learning Policy Using Custom Training Loop and Custom Training Loop with Simulink Action Noise.
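The difference between the two calling conventions can be seen side by side. This is a hedged sketch (specification dimensions are illustrative, and the critic uses a default network):

```matlab
% Build a small discrete-action critic for comparison purposes.
obsInfo = rlNumericSpec([4 1]);
actInfo = rlFiniteSetSpec([1 2]);
critic = rlVectorQValueFunction(obsInfo,actInfo);

obs = {rand(4,1)};

% getValue returns the Q-value vector directly.
v1 = getValue(critic,obs);

% evaluate returns the same result wrapped in a single-cell array,
% so the numeric data is in the first cell.
v2 = evaluate(critic,obs);
% v2{1} contains the same values as v1
```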
Version History
Introduced in R2020a