Automatic Generation of Warp-Level Primitives and Atomic Instructions for Fast and Portable Parallel Reduction on GPUs.
Simon Garcia De Gonzalo, Sitao Huang, Juan Gómez-Luna, Simon D. Hammond, Onur Mutlu, Wen-Mei Hwu
Published in: CGO (2019)
Keyphrases
- parallel processing
- parallel programming
- automatically generate
- parallel architectures
- computational power
- high level
- highly parallel
- higher level
- lightweight
- multicore processors
- database
- lower level
- fine grained
- low level
- wireless sensor networks
- database systems
- information systems
- search engine
- genetic algorithm
- data sets
- real time