Efficient Interactive LLM Serving with Proxy Model-based Sequence Length Prediction.

Haoran Qiu, Weichao Mao, Archit Patke, Shengkun Cui, Saurabh Jha, Chen Wang, Hubertus Franke, Zbigniew T. Kalbarczyk, Tamer Basar, Ravishankar K. Iyer
Published in: CoRR (2024)
Keyphrases
  • real time
  • prediction accuracy
  • data sets
  • neural network
  • machine learning
  • information retrieval
  • computer vision
  • information systems
  • decision trees
  • data driven
  • computer graphics
  • cost effective