Opara: Exploiting Operator Parallelism for Expediting DNN Inference on GPUs.
Aodong Chen, Fei Xu, Li Han, Yuan Dong, Li Chen, Zhi Zhou, Fangming Liu
Published in: CoRR (2023)
Keyphrases
- parallel processing
- computational power
- inference process
- parallel architectures
- multi-core systems
- general purpose
- commodity hardware
- Bayesian inference
- Bayesian networks
- knowledge representation
- belief networks
- training process
- single instruction, multiple data
- neural network
- inference mechanism
- Bayesian model
- probabilistic inference
- graphical models
- database systems
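For readers unfamiliar with the idea named in the title, the sketch below illustrates the basic mechanism behind inter-operator parallelism on a GPU: two independent operators from different branches of a model are issued on separate CUDA streams, so the hardware scheduler may overlap their execution. This is a minimal PyTorch illustration under assumed shapes and a hypothetical two-branch structure, not the Opara system described in the paper.

```python
import torch
import torch.nn.functional as F

# Two independent "operators" (convolutions from different branches of a
# hypothetical DNN graph) launched on separate CUDA streams, so the GPU
# may overlap them when resources allow. Shapes are illustrative only.
x = torch.randn(1, 64, 56, 56, device="cuda")
w1 = torch.randn(64, 64, 3, 3, device="cuda")   # branch A: 3x3 conv
w2 = torch.randn(64, 64, 1, 1, device="cuda")   # branch B: 1x1 conv

s1, s2 = torch.cuda.Stream(), torch.cuda.Stream()

# Make both side streams wait for the default stream, where x, w1, w2
# were produced, before reading them.
s1.wait_stream(torch.cuda.current_stream())
s2.wait_stream(torch.cuda.current_stream())

with torch.cuda.stream(s1):
    y1 = F.conv2d(x, w1, padding=1)             # operator on stream 1
with torch.cuda.stream(s2):
    y2 = F.conv2d(x, w2)                        # operator on stream 2

torch.cuda.synchronize()                        # join both streams
out = y1 + y2                                   # merge the two branches
```

The sketch only shows the raw stream mechanism; per the title, Opara's contribution lies in deciding how to schedule such independent operators automatically, which is beyond this illustration.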