CUDA-NP: realizing nested thread-level parallelism in GPGPU applications.
Yi Yang, Huiyang Zhou
Published in: PPOPP (2014)
Keyphrases
- thread-level parallelism
- compute unified device architecture
- parallel algorithm
- graphics processing units
- shared memory
- parallel processing
- multi-core processors
- instruction set
- computational complexity
- parallel computing
- parallel implementation
- general purpose
- parallel programming
- graphics hardware
- GPU implementation
- parallel computation
- database systems
- general purpose computing