A model-driven approach to warp/thread-block level GPU cache bypassing.
Hongwen Dai, Chao Li, Huiyang Zhou, Saurabh Gupta, Christos Kartsaklis, Mike Mantor
Published in: DAC (2016)
Keyphrases
- graphics hardware
- real time
- prefetching
- multithreading
- data access
- parallel computation
- main memory
- parallel implementation
- metamodel
- hit rate
- gpu implementation
- memory hierarchy
- cache management
- hit ratio
- query processing
- gpu accelerated
- graphics processing units
- memory bandwidth
- parallel computing
- access patterns
- cluster of workstations
- embedded processors
- graphics processors
- response time
- cache misses
- parallel programming
- cache replacement
- finer granularity
- web caching
- operating system
- medical images
- general purpose
- search engine