SPLIT: QoS-Aware DNN Inference on Shared GPU via Evenly-Sized Model Splitting.

Diaohan Luo, Tian Yu, Yuewen Wu, Heng Wu, Tao Wang, Wenbo Zhang
Published in: ICPP (2023)
Keyphrases
  • qos aware
  • real time
  • management system
  • database systems
  • response time
  • peer to peer