DistServe: Disaggregating Prefill and Decoding for Goodput-optimized Large Language Model Serving.

Yinmin Zhong, Shengyu Liu, Junda Chen, Jianbo Hu, Yibo Zhu, Xuanzhe Liu, Xin Jin, Hao Zhang
Published in: CoRR (2024)
Keyphrases