Efficient implementation of OpenACC cache directive on NVIDIA GPUs.
Ahmad Lashgar, Amirali Baniasadi
Published in: Int. J. High Perform. Comput. Netw. (2019)
Keyphrases
- efficient implementation
- graphics processing units
- graphics hardware
- highly parallel
- gpu implementation
- compute unified device architecture
- parallel programming
- prefetching
- parallel computation
- graphics processors
- data access
- hardware implementation
- efficient processing
- shared memory multiprocessor
- active set
- general purpose computing
- cache misses
- cpu implementation
- main memory
- general purpose
- query processing
- processing units
- high end
- parallel architectures
- multi dimensional
- real time
- parallel computing
- scientific computing
- multithreading
- parallel implementation
- data management
- response time