Scalable Tuning of (OpenMP) GPU Applications via Kernel Record and Replay.
Konstantinos Parasyris, Giorgis Georgakoudis, Esteban Rangel, Ignacio Laguna, Johannes Doerfert
Published in: SC (2023)
Keyphrases
- parallel programming
- graphics processing units
- parallel computing
- commodity hardware
- shared memory
- real time
- kernel methods
- support vector
- kernel function
- memory efficient
- parallel computation
- low overhead
- general purpose
- parallel implementation
- high performance computing
- multi processor
- parallel processing
- convolution kernel
- mutual subspace method
- highly scalable
- component analysis
- kernel pca
- fine tuning
- kernel learning
- kernel parameters
- tuning parameters
- gpu implementation
- gaussian processes
- rule selection
- lightweight
- feature space
- database