Do LSTM and fully connected layers change channels or neurons?

2 views (last 30 days)
When I was building my network, I was surprised to find that LSTM and fully connected layers change the number of channels rather than the number of neurons. My input is a one-dimensional signal (1024 sampling points), which the network analyzer shows as 1 (C) × 1 (B) × 1024 (T). After passing through a BiLSTM, the channel dimension changes while the others remain unchanged, which seems strange and inconsistent with the theory. Additionally, how can I build an SE attention module in MATLAB? The multiplication layer has no broadcasting and can only multiply element by element, so how can a 1 × 1 × C array be multiplied with an H × W × C array? Thank you all for your guidance!

Accepted Answer

Ben 2023-9-20
We use "channels" (C) to refer to the feature dimension. In the case of LSTM, BiLSTM, or GRU, I think of the operation as a loop over a sequence of vector inputs, where each vector has size C. In practice we have a multi-dimensional array of size C×B×T: the operation loops over the sequence (T) dimension, vectorizes over the batch (B) dimension, and uses the C dimension in the computation.
Perhaps the confusion is that LSTMs often have their output fed back as their input, but this is not necessary, so you can have an LSTM with input vectors of size C1 and output vectors of size C2. You can find a description of the algorithm in the "Algorithms" section of the doc page: https://www.mathworks.com/help/deeplearning/ref/nnet.cnn.layer.lstmlayer.html, or you can see that the input dimension and the output/hidden state dimension can differ in the algorithm described here: https://en.wikipedia.org/wiki/Long_short-term_memory
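To make this concrete, here is a minimal sketch (not the asker's actual network; the hidden size 128 and the final output size 10 are made-up values) showing the C dimension changing through a BiLSTM while B and T stay fixed:

```matlab
% Sketch: a 1-channel signal of 1024 time steps through a BiLSTM.
layers = [
    sequenceInputLayer(1)      % input size C = 1 (one channel)
    bilstmLayer(128)           % 128 hidden units per direction -> C = 256
    fullyConnectedLayer(10)];  % maps the 256 channels down to 10
net = dlnetwork(layers);

% A CxBxT dlarray: C = 1 channel, B = 1 observation, T = 1024 steps
X = dlarray(rand(1,1,1024),"CBT");
Y = forward(net,X);
size(Y)   % expect 10 x 1 x 1024: C changed, B and T unchanged
```

The fully connected layer behaves the same way on sequences: it is applied independently at every time step, so it too changes only the C dimension.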
Could you clarify what an SE attention module is?
We can implement attention mechanisms that require batch matrix multiplies by using pagemtimes: https://www.mathworks.com/help/matlab/ref/pagemtimes.html. This is supported as a dlarray method, so it can be used in autodiff workflows such as custom training loops and custom layers. Alternatively, if you need standard self-attention, you can use selfAttentionLayer: https://www.mathworks.com/help/deeplearning/ref/nnet.cnn.layer.selfattentionlayer.html
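As an illustration of the pagemtimes approach, here is a sketch of scaled dot-product attention over a batch (all sizes and data are made up; each C×T page is one observation):

```matlab
% Sketch: batched scaled dot-product attention with pagemtimes.
C = 8; T = 16; B = 4;                 % channels, time steps, batch size
Q = dlarray(rand(C,T,B));             % queries
K = dlarray(rand(C,T,B));             % keys
V = dlarray(rand(C,T,B));             % values

% scores(i,j) = q_i . k_j for each observation, scaled by sqrt(C)
scores = pagemtimes(Q,"transpose",K,"none")/sqrt(C);   % T x T x B

% Softmax over the key dimension (dim 2) of each page
A = exp(scores - max(scores,[],2));
A = A ./ sum(A,2);

% Weighted sum of values: output column i is sum_j A(i,j)*v_j
Y = pagemtimes(V,"none",A,"transpose");                % C x T x B
```

Because pagemtimes multiplies page by page over the trailing dimension, the whole batch is handled in one call, and dlarray support means gradients flow through it in a custom training loop.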

More Answers (1)

秀新 2023-9-21
Firstly, thank you very much for your answer. Let me explain the SE attention module: it comes from the 2018 paper "Squeeze-and-Excitation Networks". When the input is a sequence, the structure described in that paper cannot be built directly in MATLAB. I tried replacing the global pooling with average pooling, which achieves a similar result.
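Regarding the broadcast multiply the SE block needs: while the trainable-layer multiplication may be restrictive, dlarray element-wise arithmetic in a custom training loop or functionLayer does support implicit expansion, matching dimensions by label and expanding singletons. A minimal sketch of the SE "scale" step on sequence data (sizes and weights are made up; the sigmoid output here stands in for a learned excitation branch):

```matlab
% Sketch: SE rescaling of a CxBxT sequence via dlarray implicit expansion.
C = 64; B = 2; T = 1024;
X = dlarray(rand(C,B,T),"CBT");          % feature sequence
s = sigmoid(dlarray(rand(C,B),"CB"));    % stand-in for the excitation output
Y = X .* s;                              % s expands across all T time steps
size(Y)                                  % same size as X: 64 x 2 x 1024
```

The same idea covers the image case from the question: an H×W×C feature map times a 1×1×C weight vector broadcasts the per-channel weights over every spatial location, e.g. `rand(H,W,C) .* reshape(w,1,1,C)`.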

Release

R2023a
