ExeGPT: Constraint-Aware Resource Scheduling for LLM Inference.

Hyungjun Oh, Kihong Kim, Jaemin Kim, Sungkyun Kim, Junyeol Lee, Du-seong Chang, Jiwon Seo
Published in: ASPLOS (2) (2024)
Keyphrases
  • resource scheduling
  • load balancing
  • grid systems
  • database systems
  • quality management