ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and Two-Phase Partition.
Lu Ye
Ze Tao
Yong Huang
Yang Li
Published in:
ACL (1) (2024)
Keyphrases
real time
data structure
cost effective
data sets
case study