SparQ Attention: Bandwidth-Efficient LLM Inference.

Luka Ribar, Ivan Chelombiev, Luke Hudlass-Galley, Charlie Blake, Carlo Luschi, Douglas Orr
Published in: CoRR (2023)
Keyphrases
  • computationally efficient
  • databases
  • focus of attention
  • neural network
  • data mining
  • information systems
  • data structure
  • cost effective
  • visual attention
  • efficient learning