NoMAD-Attention: Efficient LLM Inference on CPUs Through Multiply-add-free Attention
Tianyi Zhang
Jonah Wonkyu Yi
Bowen Yao
Zhaozhuo Xu
Anshumali Shrivastava
Published in: CoRR (2024)
Keyphrases
visual attention
focus of attention
computer vision
search algorithm
neural network
information systems
decision making
bayesian networks
computationally efficient
computationally expensive
efficient learning