oclCUB: an OpenCL parallel computing library for deep learning operators.
Changqing Shi, Yufei Sun, Yicheng Sui, Yuqiao Chen, Haotian Wang, Yuzhi Zhang
Published in: CCF Trans. High Perform. Comput. (2024)
Keyphrases
- parallel computing
- deep learning
- shared memory
- parallel programming
- graphics processing units
- massively parallel
- high performance computing
- unsupervised feature learning
- computing systems
- unsupervised learning
- mental models
- parallel computation
- machine learning
- parallel execution
- parallel architectures
- message passing
- parallel machines
- transactional memory
- parallel computers
- parallel algorithm
- processing units
- weakly supervised
- pairwise
- field programmable gate array
- data points