Reducing GPU offload latency via fine-grained CPU-GPU synchronization.
Daniel Lustig, Margaret Martonosi
Published in: HPCA (2013)
Keyphrases
- fine grained
- heterogeneous computing
- graphics processing units
- coarse grained
- graphics processors
- gpu implementation
- massively parallel
- real time
- memory bandwidth
- data transfer
- access control
- graphics hardware
- general purpose
- multithreading
- parallel implementation
- parallel computing
- compute intensive
- parallel processing
- high performance computing
- parallel computation
- tightly coupled
- computing systems
- level parallelism