DistServe: Disaggregating Prefill and Decoding for Goodput-optimized Large Language Model Serving.

Published in: OSDI (2024)

Keyphrases