On the Role of Attention Masks and LayerNorm in Transformers.

Xinyi Wu, Amir Ajorlou, Yifei Wang, Stefanie Jegelka, Ali Jadbabaie
Published in: CoRR (2024)
Keyphrases
  • real time
  • visual attention
  • social networks
  • neural network
  • support vector
  • information technology