Accelerating Sparse CNN Inference on GPUs with Performance-Aware Weight Pruning
Masuma Akter Rumi, Xiaolong Ma, Yanzhi Wang, Peng Jiang
Published in: PACT (2020)
Keyphrases
- probabilistic inference
- cellular neural networks
- high dimensional
- search space
- avoid overfitting
- sparse data
- inference process
- bayesian networks
- tree pruning
- dictionary learning
- convolutional neural network
- pruning algorithms
- covariance function
- pruning method
- compressed sensing
- inference engine
- bayesian inference
- general purpose
- search algorithm
- compressive sensing
- gaussian processes
- parallel programming
- graphics hardware
- parallel processing
- fixed point
- cloud computing
- graphical models