layernorm
Normalize data across all channels for each observation independently
Since R2021a
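Syntax
Y = layernorm(X,offset,scaleFactor)
Y = layernorm(X,offset,scaleFactor,'DataFormat',FMT)
Y = layernorm(___,Name,Value)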
Description
The layer normalization operation normalizes the input data across all channels for each observation independently. To speed up training of recurrent and multilayer perceptron neural networks and reduce the sensitivity to network initialization, use layer normalization after the learnable operations, such as LSTM and fully connect operations.
After normalization, the operation shifts the input by a learnable offset β and scales it by a learnable scale factor γ.
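As a sketch of the arithmetic this paragraph describes, the operation can be written in the standard form given by Ba et al. [1]. The symbols μ_L and σ_L² (the mean and variance taken over the normalized dimensions of one observation) and ε (a small constant added for numerical stability, corresponding to the Epsilon option) are notation assumed here for illustration.

```latex
\hat{x}_i = \frac{x_i - \mu_L}{\sqrt{\sigma_L^{2} + \epsilon}},
\qquad
y_i = \gamma \, \hat{x}_i + \beta
```

With γ = 1 and β = 0, the output for each observation has approximately zero mean and unit variance.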
The layernorm function applies the layer normalization operation to dlarray data. Using dlarray objects makes working with high dimensional data easier by allowing you to label the dimensions. For example, you can label which dimensions correspond to spatial, time, channel, and batch dimensions using the "S", "T", "C", and "B" labels, respectively. For unspecified and other dimensions, use the "U" label. For dlarray object functions that operate over particular dimensions, you can specify the dimension labels by formatting the dlarray object directly, or by using the DataFormat option.
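A minimal sketch of dimension labeling, assuming Deep Learning Toolbox is installed. The array sizes (28-by-28 images, 3 channels, 16 observations) are arbitrary values chosen for illustration.

```matlab
% Random data standing in for a mini-batch of 2-D images:
% height x width x channels x observations.
data = rand(28,28,3,16);

% Formatted dlarray: two spatial dimensions, then channel, then batch.
X = dlarray(data,"SSCB");

% Query the dimension labels attached to the array.
fmt = dims(X);
disp(fmt)
```

Functions such as layernorm read these labels to decide which dimensions to normalize over, so no separate format argument is needed for formatted input.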
Note
To apply layer normalization within a dlnetwork object, use layerNormalizationLayer.
Y = layernorm(X,offset,scaleFactor) applies the layer normalization operation to the input data X and transforms it using the specified offset and scale factor.
The function normalizes over the 'S' (spatial), 'T' (time), 'C' (channel), and 'U' (unspecified) dimensions of X for each observation in the 'B' (batch) dimension, independently.
For unformatted input data, use the 'DataFormat' option.
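The following sketch illustrates this syntax on formatted input, assuming Deep Learning Toolbox is installed. The input sizes and the choice of a zero offset and unit scale factor (one element per channel) are illustrative assumptions, not requirements of the function.

```matlab
numChannels = 3;
miniBatchSize = 16;

% Formatted input: spatial, spatial, channel, batch.
X = dlarray(rand(28,28,numChannels,miniBatchSize),"SSCB");

% One offset and one scale factor per channel. Zeros and ones leave the
% normalized values unshifted and unscaled.
offset = zeros(numChannels,1);
scaleFactor = ones(numChannels,1);

Y = layernorm(X,offset,scaleFactor);

% Each observation is normalized over its 'S' and 'C' dimensions, so the
% per-observation mean of the output should be close to zero.
obsMean = mean(extractdata(Y),[1 2 3]);
disp(max(abs(obsMean),[],'all'))
```

The output Y keeps the size and the "SSCB" format of X.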
Y = layernorm(X,offset,scaleFactor,'DataFormat',FMT) applies the layer normalization operation to the unformatted dlarray object X with the format specified by FMT. The output Y is an unformatted dlarray object with dimensions in the same order as X. For example, 'DataFormat','SSCB' specifies data for 2-D image input with the format 'SSCB' (spatial, spatial, channel, batch).
To specify the format of the scale and offset, use the 'ScaleFormat' and 'OffsetFormat' options, respectively.
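A short sketch of the 'DataFormat' syntax on unformatted input, assuming Deep Learning Toolbox is installed. The sizes and the zero offset and unit scale factor are again illustrative choices.

```matlab
numChannels = 3;

% Unformatted dlarray: no dimension labels are attached.
X = dlarray(rand(28,28,numChannels,16));

offset = zeros(numChannels,1);
scaleFactor = ones(numChannels,1);

% Describe the layout of X through the 'DataFormat' option instead.
Y = layernorm(X,offset,scaleFactor,'DataFormat','SSCB');

% The output is unformatted and has the same dimension order as X.
disp(isempty(dims(Y)))
disp(size(Y))
```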
Y = layernorm(___,Name,Value) specifies options using one or more name-value pair arguments in addition to the input arguments in previous syntaxes. For example, 'Epsilon',1e-4 sets the epsilon value to 1e-4.
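As a sketch of the name-value syntax, the call below passes the 'Epsilon' option from the example above. It assumes Deep Learning Toolbox is installed, and the input sizes and parameter values are illustrative.

```matlab
numChannels = 3;
X = dlarray(rand(28,28,numChannels,16),"SSCB");
offset = zeros(numChannels,1);
scaleFactor = ones(numChannels,1);

% Override the default epsilon with 1e-4.
Y = layernorm(X,offset,scaleFactor,'Epsilon',1e-4);

disp(size(Y))
```

A larger epsilon can help when the variance of an observation is very small, at the cost of a slightly less exact normalization.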
Examples
Input Arguments
Output Arguments
Algorithms
References
[1] Ba, Jimmy Lei, Jamie Ryan Kiros, and Geoffrey E. Hinton. “Layer Normalization.” Preprint, submitted July 21, 2016. https://arxiv.org/abs/1607.06450.
Extended Capabilities
Version History
Introduced in R2021a
See Also
relu | fullyconnect | dlconv | dlarray | dlgradient | dlfeval | groupnorm | batchnorm