CoFB: latency-constrained co-scheduling of flows and batches for deep learning inference service on the CPU-GPU system.
Qi Zhang, Yi Liu, Tao Liu, Depei Qian
Published in: J. Supercomput. (2023)
Keyphrases
- deep learning
- heterogeneous computing
- unsupervised learning
- data transfer
- unsupervised feature learning
- grid computing
- machine learning
- gpu implementation
- memory bandwidth
- deep architectures
- graphics processing units
- weakly supervised
- graphics processors
- parallel computing
- mental models
- bayesian networks
- differentiated services
- response time
- text mining