Sai

Last seen: 4 months 前 | 自 2025 起处于活动状态

Followers: 0 Following: 0

统计学

Feeds

提问

How do I apply a padding mask(B*T) variable into a training of a self-attention transformer decoder, from a 3D Matrix(B*T*C)
While attempting to train my neural network that uses a self attention layer in its transformer block. I have been struggling to...

7 months 前 | 1 个回答 | 0

1

个回答