ExeGPT: Constraint-Aware Resource Scheduling for LLM Inference.
Hyungjun Oh
Kihong Kim
Jaemin Kim
Sungkyun Kim
Junyeol Lee
Du-seong Chang
Jiwon Seo
Published in:
ASPLOS (2) (2024)
Keyphrases
resource scheduling
load balancing
grid systems
database systems
quality management