CudaDMA: optimizing GPU memory bandwidth via warp specialization.
Michael Bauer, Henry Cook, Brucek Khailany
Published in: SC (2011)
Keyphrases
- computer systems
- memory bandwidth
- hardware and software
- processing power
- level parallelism
- floating point
- parallel programming
- processing units
- commodity hardware
- single instruction multiple data
- real time
- high end
- parallel algorithm
- parallel computing
- shared memory
- parallel processing
- cache misses
- limited resources
- memory access
- management system
- data model
- database