Keyformer: KV Cache Reduction through Key Tokens Selection for Efficient Generative Inference
Muhammad Adnan
Akhil Arunkumar
Gaurav Jain
Prashant J. Nair
Ilya Soloveychik
Purushotham Kamath
Published in:
MLSys (2024)
Keyphrases
unsupervised learning
neural network
databases
expert systems
data driven
cost effective
computationally expensive
selection algorithm