gru
Syntax

Y = gru(X,H0,weights,recurrentWeights,bias)
[Y,hiddenState] = gru(X,H0,weights,recurrentWeights,bias)
___ = gru(X,H0,weights,recurrentWeights,bias,DataFormat=FMT)
___ = gru(X,H0,weights,recurrentWeights,bias,Name=Value)

Description
The gated recurrent unit (GRU) operation allows a network to learn dependencies between time steps in time series and sequence data.
Y = gru(X,H0,weights,recurrentWeights,bias) applies a gated recurrent unit (GRU) calculation to input X using the initial hidden state H0 and the parameters weights, recurrentWeights, and bias. The input X must be a formatted dlarray. The output Y is a formatted dlarray with the same dimension format as X, except for any "S" dimensions.
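For example, this minimal sketch (with illustrative sizes and random parameters, not taken from the original page) shows the expected parameter shapes: weights is 3*numHiddenUnits-by-inputSize, recurrentWeights is 3*numHiddenUnits-by-numHiddenUnits, and bias is 3*numHiddenUnits-by-1.

numFeatures = 10;
numObservations = 64;
sequenceLength = 20;
numHiddenUnits = 100;

% Formatted input with channel ("C"), batch ("B"), and time ("T") dimensions.
X = dlarray(randn(numFeatures,numObservations,sequenceLength),"CBT");

% Initial hidden state and learnable parameters (random values for illustration).
H0 = zeros(numHiddenUnits,1);
weights = dlarray(randn(3*numHiddenUnits,numFeatures));
recurrentWeights = dlarray(randn(3*numHiddenUnits,numHiddenUnits));
bias = dlarray(randn(3*numHiddenUnits,1));

% Y is a formatted dlarray with the same "CBT" format as X.
Y = gru(X,H0,weights,recurrentWeights,bias);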
The gru function updates the hidden state using the hyperbolic tangent function (tanh) as the state activation function. The gru function uses the sigmoid function given by σ(x) = (1 + e^−x)^−1 as the gate activation function.
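For reference, the GRU computation at time step t can be written as follows. These are the GRU equations introduced by Cho et al. [1], shown here in the variant where the reset gate is applied after the recurrent multiplication (implementations differ on this point); the subscripted W, R, and b denote the per-gate slices of weights, recurrentWeights, and bias, and this notation is introduced here only for illustration.

$$
\begin{aligned}
r_t &= \sigma(W_r x_t + R_r h_{t-1} + b_r) \\
z_t &= \sigma(W_z x_t + R_z h_{t-1} + b_z) \\
\tilde{h}_t &= \tanh\bigl(W_{\tilde{h}} x_t + r_t \odot (R_{\tilde{h}} h_{t-1}) + b_{\tilde{h}}\bigr) \\
h_t &= (1 - z_t) \odot \tilde{h}_t + z_t \odot h_{t-1}
\end{aligned}
$$

Here r_t is the reset gate, z_t is the update gate, h̃_t is the candidate state, and ⊙ denotes elementwise multiplication.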
[Y,hiddenState] = gru(X,H0,weights,recurrentWeights,bias) also returns the hidden state after the GRU operation.
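For example, the returned hidden state can seed a later call, which is one way to carry state across chunks of a long sequence (a sketch with illustrative sizes; X1 and X2 stand for two consecutive chunks of one sequence):

numFeatures = 10;
numHiddenUnits = 100;
X1 = dlarray(randn(numFeatures,1,50),"CBT");   % first chunk
X2 = dlarray(randn(numFeatures,1,50),"CBT");   % second chunk
H0 = zeros(numHiddenUnits,1);
weights = randn(3*numHiddenUnits,numFeatures);
recurrentWeights = randn(3*numHiddenUnits,numHiddenUnits);
bias = randn(3*numHiddenUnits,1);

% Carry the hidden state from the first call into the second.
[Y1,hiddenState] = gru(X1,H0,weights,recurrentWeights,bias);
[Y2,hiddenState] = gru(X2,hiddenState,weights,recurrentWeights,bias);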
___ = gru(X,H0,weights,recurrentWeights,bias,DataFormat=FMT) also specifies the dimension format FMT when X is not a formatted dlarray. The output Y is an unformatted dlarray with the same dimension order as X, except for any "S" dimensions.
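For example (again a sketch with illustrative sizes), the layout of an unformatted input can be described with DataFormat:

numFeatures = 10;
numObservations = 64;
sequenceLength = 20;
numHiddenUnits = 100;

% Unformatted dlarray input; DataFormat describes its dimension layout.
X = dlarray(randn(numFeatures,numObservations,sequenceLength));
H0 = zeros(numHiddenUnits,1);
weights = randn(3*numHiddenUnits,numFeatures);
recurrentWeights = randn(3*numHiddenUnits,numHiddenUnits);
bias = randn(3*numHiddenUnits,1);

% Y is an unformatted dlarray with the same dimension order as X.
Y = gru(X,H0,weights,recurrentWeights,bias,DataFormat="CBT");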
___ = gru(X,H0,weights,recurrentWeights,bias,Name=Value) specifies additional options using one or more name-value arguments.
Examples
Input Arguments
Output Arguments
More About
References
[1] Cho, Kyunghyun, Bart Van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. "Learning phrase representations using RNN encoder-decoder for statistical machine translation." arXiv preprint arXiv:1406.1078 (2014).
Extended Capabilities
Version History
Introduced in R2020a

See Also
dlarray
| fullyconnect
| softmax
| dlgradient
| dlfeval
| lstm
| attention