Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving.
Yinwei Dai
Rui Pan
Anand P. Iyer
Kai Li
Ravi Netravali
Published in:
CoRR (2023)
Keyphrases
low latency
response time
resource utilization
maximum likelihood
high speed
high throughput
prefetching
real time
virtual machine
learning processes