ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and Two-Phase Partition.
Lu Ye
Ze Tao
Yong Huang
Yang Li
Published in:
CoRR (2024)
Keyphrases
computationally efficient
cost effective
data structure
computationally expensive
learning algorithm
multi dimensional
lightweight