KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization.

Coleman Hooper, Sehoon Kim, Hiva Mohammadzadeh, Michael W. Mahoney, Yakun Sophia Shao, Kurt Keutzer, Amir Gholami
Published in: CoRR (2024)