How Deep Learning HDL Toolbox Compiles the LSTM Layer
An LSTM is a type of recurrent neural network (RNN) that can learn long-term dependencies between time steps of sequence data. When you compile LSTM layers, Deep Learning HDL Toolbox™ splits each LSTM layer into components and generates instructions and memory offsets for those components. Integrate a deep learning processor IP core with LSTM layers into your reference design by:

- Learning about the LSTM layer components that the compile function generates and how those components are optimized.
- Identifying the external memory addresses that store the weights, biases, and instructions of the generated LSTM layer components.
LSTM Layer Architecture
The LSTM layer uses a gating mechanism that controls the memorizing process. You can store, write, or read information in LSTMs by using gates that open and close. An LSTM layer consists of these components:

- Forget gate — The forget gate, f, decides which information to remember and which information to forget.
- Input gate — The input gate, i, updates the cell state using information from the current input, x, and the previous hidden state, h.
- Cell state — The cell state, c, stores information from previous time steps. The layer computes the current cell state from the previous cell state, the forget gate, the input gate, and the cell candidate, g.
- Output gate — The output gate, o, determines the value of the next hidden state, h.
This image shows the components of an LSTM layer:
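For reference, this is a common formulation of the gate computations described above. The symbol names (W for input weights, R for recurrent weights, b for biases, σ for the sigmoid function, ⊙ for elementwise multiplication) follow the usual LSTM convention and are an assumption, not taken from this page:

```latex
\begin{aligned}
f_t &= \sigma\left(W_f x_t + R_f h_{t-1} + b_f\right) && \text{forget gate} \\
i_t &= \sigma\left(W_i x_t + R_i h_{t-1} + b_i\right) && \text{input gate} \\
g_t &= \tanh\left(W_g x_t + R_g h_{t-1} + b_g\right) && \text{cell candidate} \\
o_t &= \sigma\left(W_o x_t + R_o h_{t-1} + b_o\right) && \text{output gate} \\
c_t &= f_t \odot c_{t-1} + i_t \odot g_t && \text{cell state update} \\
h_t &= o_t \odot \tanh(c_t) && \text{hidden state update}
\end{aligned}
```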
Compiler Interpretation
The compile method of the dlhdl.Workflow object translates the:

- Forget gate into lstm.wf
- Input gate into lstm.wi
- Cell candidate into lstm.wg
- Output gate into lstm.wo
The compile method also:

- Inserts a depth concatenation layer between the layer preceding the LSTM layer and the gates of the LSTM layer.
- Generates sigmoid, hyperbolic tangent, multiplication, and addition layers to replace the mathematical operations of the LSTM layer.
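A minimal sketch of compiling a workflow for an LSTM network, to see these generated components. The MAT-file name, network variable name, and bitstream name are assumptions; substitute a bitstream that supports LSTM layers for your board (see the Deep Learning HDL Toolbox documentation for supported bitstreams):

```matlab
% Load a pretrained network that contains an LSTM layer.
% "lstmNet.mat" and the variable name "net" are placeholders (assumptions).
load lstmNet.mat net

% Create a workflow for a target bitstream. The bitstream name below is an
% assumption; use one that supports LSTM layers for your board.
hW = dlhdl.Workflow(Network=net, Bitstream='zcu102_lstm_single');

% Compile the network. The compile method splits each LSTM layer into its
% gate components (lstm.wf, lstm.wi, lstm.wg, lstm.wo) and reports the
% external memory offsets for the generated weights, biases, and instructions.
dn = compile(hW);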
When the network has multiple stacked LSTM layers, the compile method uses the LSTM layer name when generating the translated instructions. For example, if the network has three stacked LSTM layers named lstm_1, lstm_2, and lstm_3, the compile method output is lstm_1.wi, lstm_1.wo, lstm_1.wg, lstm_1.wf, lstm_2.wi, and so on. The compiler schedules the different components of the LSTM layer, such as fully connected layers, sigmoid blocks, and tanh blocks, into different kernels within the deep learning processor architecture.
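For illustration, a layer array like the following sketch would produce the lstm_1, lstm_2, and lstm_3 prefixes in the compile method output. The input size, number of hidden units, and output size are arbitrary assumptions:

```matlab
% Three stacked LSTM layers; the layer names determine the prefixes
% (lstm_1.wi, lstm_2.wi, ...) that appear in the compile method output.
layers = [
    sequenceInputLayer(12)
    lstmLayer(100, Name='lstm_1')
    lstmLayer(100, Name='lstm_2')
    lstmLayer(100, Name='lstm_3')
    fullyConnectedLayer(9)
    softmaxLayer
    classificationLayer];
```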
This image shows a graphical view of the compile method transformation of the LSTM layer:
To see the output of the compile method for an LSTM network, see Run Sequence-to-Sequence Classification on FPGAs by Using Deep Learning HDL Toolbox.