High-Performance GPU-to-CPU Transpilation and Optimization via High-Level Parallel Constructs.
William S. Moses, Ivan R. Ivanov, Jens Domke, Toshio Endo, Johannes Doerfert, Oleksandr Zinenko
Published in: CoRR (2022)
Keyphrases
- graphics processing units
- high level
- general purpose
- parallel processing
- pc cluster
- gpu implementation
- parallel computing
- parallel implementation
- highly parallel
- parallel computation
- parallel programming
- graphics hardware
- massively parallel
- low level
- graphics processors
- parallel architectures
- compute unified device architecture
- computing systems
- efficient implementation
- global optimization
- real time
- scientific computing
- floating point
- high performance computing
- programming language
- optimization problems
- distributed memory machines
- level parallelism
- memory bandwidth
- processing units
- parallel computers
- shared memory
- multithreading
- distributed memory
- combinatorial optimization
- optimization method
- optimization process
- neural network
- heterogeneous computing
- cpu implementation
- evolutionary algorithm
- higher level
- optimization algorithm
- multi core processors
- parallel machines
- coarse grained
- data partitioning