Efficient Interactive LLM Serving with Proxy Model-based Sequence Length Prediction.
Haoran Qiu
Weichao Mao
Archit Patke
Shengkun Cui
Saurabh Jha
Chen Wang
Hubertus Franke
Zbigniew T. Kalbarczyk
Tamer Basar
Ravishankar K. Iyer
Published in:
CoRR (2024)
Keyphrases
real time
prediction accuracy
data sets
neural network
machine learning
information retrieval
computer vision
information systems
decision trees
data driven
computer graphics
cost effective