Attention Score is not All You Need for Token Importance Indicator in KV Cache Reduction: Value Also Matters.

Zhiyu Guo, Hidetaka Kamigaito, Taro Watanabe
Published in: CoRR (2024)
Keyphrases
  • data access
  • relative importance
  • main memory
  • visual attention
  • response time
  • input output
  • prefetching
  • focus of attention
  • cache management