Mixture of Attention Heads: Selecting Attention Heads Per Token.
Xiaofeng Zhang
Yikang Shen
Zeyu Huang
Jie Zhou
Wenge Rong
Zhang Xiong
Published in:
CoRR (2022)
Keyphrases
visual attention
information retrieval
special case
neural network
similarity measure
bayesian networks
reinforcement learning
multi agent
digital libraries
expert systems
probabilistic model
medical images
mixture model
focus of attention
selective attention