SpecInfer: Accelerating Generative LLM Serving with Speculative Inference and Token Tree Verification.

Xupeng Miao, Gabriele Oliaro, Zhihao Zhang, Xinhao Cheng, Zeyu Wang, Rae Ying Yee Wong, Zhuoming Chen, Daiyaan Arfeen, Reyna Abhyankar, Zhihao Jia
Published in: CoRR (2023)
Keyphrases