CUDA-NP: Realizing Nested Thread-Level Parallelism in GPGPU Applications.
Yi Yang, Chao Li, Huiyang Zhou
Published in: J. Comput. Sci. Technol. (2015)
Keyphrases
- level parallelism
- compute unified device architecture
- graphics processing units
- gpu implementation
- parallel algorithm
- parallel processing
- shared memory
- general purpose
- parallel implementation
- instruction set
- parallel computing
- multi core processors
- computational complexity
- parallel programming
- floating point
- real time
- message passing