An Efficient Vectorization Approach to Nested Thread-level Parallelism for CUDA GPUs.
Shixiong Xu
David Gregg
Published in:
PACT (2015)
Keyphrases
level parallelism
general purpose
parallel programming
multi core processors
gpu implementation
memory bandwidth
graphics hardware
parallel processing
parallel implementation
real time
efficient implementation
computational power
graphics processing units
instruction set