Paella: Low-latency Model Serving with Software-defined GPU Scheduling.

Kelvin K. W. Ng, Henri Maxime Demoulin, Vincent Liu
Published in: SOSP (2023)
Keyphrases
  • low latency
  • real time
  • relational databases
  • end to end