Performance and portability with OpenCL for throughput-oriented HPC workloads across accelerators, coprocessors, and multicore processors.
Chongxiao Cao, Mark Gates, Azzam Haidar, Piotr Luszczek, Stanimire Tomov, Ichitaro Yamazaki, Jack J. Dongarra
Published in: ScalA@SC (2014)
Keyphrases
- multicore processors
- parallel programming
- graphics processing units
- parallel algorithm
- parallel architectures
- high performance computing
- highly parallel
- computing systems
- massively parallel
- computing power
- parallel computing
- operating system
- message passing interface
- scientific computing
- general purpose
- response time
- parallel processing
- parallel computation
- computer systems
- shared memory
- high end
- processing units
- programming environment
- cloud computing
- database systems
- smart card
- real time
- efficient implementation
- fault tolerance
- information systems
- computing platform
- computer architecture
- single chip
- software engineering