Mixture of Attention Heads: Selecting Attention Heads Per Token.
Xiaofeng Zhang
Yikang Shen
Zeyu Huang
Jie Zhou
Wenge Rong
Zhang Xiong
Published in:
EMNLP (2022)
Keyphrases
data sets
visual attention
search algorithm
focus of attention
neural network
social networks
information systems
preprocessing
computer simulation