SpecInfer: Accelerating Large Language Model Serving with Tree-based Speculative Inference and Verification.

Xupeng Miao, Gabriele Oliaro, Zhihao Zhang, Xinhao Cheng, Zeyu Wang, Zhengxin Zhang, Rae Ying Yee Wong, Alan Zhu, Lijie Yang, Xiaoxiang Shi, Chunan Shi, Zhuoming Chen, Daiyaan Arfeen, Reyna Abhyankar, Zhihao Jia
Published in: ASPLOS (3) (2024)
Keyphrases