ExeGPT: Constraint-Aware Resource Scheduling for LLM Inference.

Hyungjun Oh, Kihong Kim, Jaemin Kim, Sungkyun Kim, Junyeol Lee, Du-seong Chang, Jiwon Seo
Published in: CoRR (2024)
Keyphrases
  • resource scheduling
  • load balancing
  • grid systems
  • quality management
  • business processes