DistServe: Disaggregating Prefill and Decoding for Goodput-optimized Large Language Model Serving.
Yinmin Zhong
Shengyu Liu
Junda Chen
Jianbo Hu
Yibo Zhu
Xuanzhe Liu
Xin Jin
Hao Zhang
Published in: OSDI (2024)
Keyphrases
language model
language modeling
document retrieval
n-gram
query expansion
probabilistic model
information retrieval
speech recognition
test collection
retrieval model
language modelling
ad hoc information retrieval
mixture model
query terms
statistical language models
translation model
smoothing methods
language models for information retrieval
context sensitive
word error rate
relevance model
document collections
query specific
statistical machine translation
cross lingual
retrieval effectiveness
error rate
language model for information retrieval