Scissorhands: Exploiting the Persistence of Importance Hypothesis for LLM KV Cache Compression at Test Time.
Zichang Liu
Aditya Desai
Fangshuo Liao
Weitao Wang
Victor Xie
Zhaozhuo Xu
Anastasios Kyrillidis
Anshumali Shrivastava
Published in:
CoRR (2023)
Keyphrases
neural network
data compression
query processing
image compression
databases
computational intelligence
data access
compression ratio
relative importance
data structure
main memory
compression scheme
prefetching
back end
transmission line
cache management