Optimizing Speculative Decoding for Serving Large Language Models Using Goodput.

Xiaoxuan Liu, Cade Daniel, Langxiang Hu, Woosuk Kwon, Zhuohan Li, Xiangxi Mo, Alvin Cheung, Zhijie Deng, Ion Stoica, Hao Zhang
Published in: CoRR (2024)