InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management.

Wonbeom Lee, Jungi Lee, Junghwan Seo, Jaewoong Sim
Published in: CoRR (2024)
Keyphrases