geluLayer

Gaussian error linear unit (GELU) layer

Since R2022b

    Description

    A Gaussian error linear unit (GELU) layer weights the input by its probability under a Gaussian distribution.

    This operation is given by

    GELU(x) = (x/2) (1 + erf(x/√2)),

    where erf denotes the error function [1].
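
    For a quick check of this definition, you can evaluate the operation directly with the erf function. This is an illustrative sketch of the formula, not the layer's internal implementation.

    x = linspace(-4,4,9);
    y = (x./2).*(1 + erf(x./sqrt(2)));   % exact GELU of each element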

    Creation

    Description

    layer = geluLayer returns a GELU layer.


    layer = geluLayer(Name=Value) sets the optional Approximation and Name properties using name-value arguments. For example, geluLayer(Name="gelu") creates a GELU layer with the name "gelu".
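
    For instance, this sketch sets both documented properties when creating the layer.

    layer = geluLayer(Approximation="tanh",Name="gelu");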

    Properties


    GELU

    Approximation method for the GELU operation, specified as one of these values:

    • 'none' — Do not use approximation.

    • 'tanh' — Approximate the underlying error function using

      erf(x/√2) ≈ tanh(√(2/π) (x + 0.044715 x^3)).

    Tip

    In MATLAB®, computing the tanh approximation is typically less accurate, and, for large input sizes, slower than computing the GELU activation without using an approximation. Use the tanh approximation when you want to reproduce models that use this approximation, such as BERT and GPT-2.
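
    For illustration, this sketch compares the exact operation with the tanh approximation over a sample input. It uses only base MATLAB functions and is not the layer's internal implementation.

    % Compare the exact GELU operation with its tanh approximation
    x = linspace(-4,4,1001);
    geluExact = (x./2).*(1 + erf(x./sqrt(2)));
    geluTanh  = (x./2).*(1 + tanh(sqrt(2/pi).*(x + 0.044715*x.^3)));
    max(abs(geluExact - geluTanh))   % largest elementwise difference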

    Layer

    Layer name, specified as a character vector or string scalar. For Layer array input, the trainnet and dlnetwork functions automatically assign names to layers with the name "".

    The GELULayer object stores this property as a character vector.

    Data Types: char | string

    This property is read-only.

    Number of inputs to the layer, returned as 1. This layer accepts a single input only.

    Data Types: double

    This property is read-only.

    Input names, returned as {'in'}. This layer accepts a single input only.

    Data Types: cell

    This property is read-only.

    Number of outputs from the layer, returned as 1. This layer has a single output only.

    Data Types: double

    This property is read-only.

    Output names, returned as {'out'}. This layer has a single output only.

    Data Types: cell

    Examples


    Create a GELU layer.

    layer = geluLayer
    layer = 
      GELULayer with properties:
    
                 Name: ''
    
       Hyperparameters
        Approximation: 'none'
    
    

    Include a GELU layer in a Layer array.

    layers = [
        imageInputLayer([28 28 1])
        convolution2dLayer(5,20)
        geluLayer
        maxPooling2dLayer(2,Stride=2)
        fullyConnectedLayer(10)
        softmaxLayer]
    layers = 
      6x1 Layer array with layers:
    
         1   ''   Image Input       28x28x1 images with 'zerocenter' normalization
         2   ''   2-D Convolution   20 5x5 convolutions with stride [1  1] and padding [0  0  0  0]
         3   ''   GELU              GELU
         4   ''   2-D Max Pooling   2x2 max pooling with stride [2  2] and padding [0  0  0  0]
         5   ''   Fully Connected   10 fully connected layer
         6   ''   Softmax           softmax
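
    As an additional sketch beyond the shipped examples, you can place a GELU layer in a dlnetwork and evaluate it on dlarray data; the layer sizes here are arbitrary choices for illustration.

    layers = [
        featureInputLayer(3)
        geluLayer];
    net = dlnetwork(layers);        % initialize a network with no learnable parameters
    X = dlarray(rand(3,5),"CB");    % 3 features (channels), 5 observations
    Y = predict(net,X);             % apply the GELU operation to the input data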
    


    References

    [1] Hendrycks, Dan, and Kevin Gimpel. "Gaussian error linear units (GELUs)." Preprint, submitted June 27, 2016. https://arxiv.org/abs/1606.08415

    Extended Capabilities

    C/C++ Code Generation
    Generate C and C++ code using MATLAB® Coder™.

    GPU Code Generation
    Generate CUDA® code for NVIDIA® GPUs using GPU Coder™.

    Version History

    Introduced in R2022b
