Dissecting the NVidia Turing T4 GPU via Microbenchmarking.
Zhe Jia, Marco Maggioni, Jeffrey Smith, Daniele Paolo Scarpazza
Published in: CoRR (2019)
Keyphrases
- graphics processing units
- graphics processors
- parallel implementation
- graphics hardware
- gpu implementation
- general purpose
- compute unified device architecture
- cpu implementation
- real time
- machine intelligence
- parallel computing
- parallel processing
- turing machine
- parallel algorithm
- parallel computation
- efficient implementation
- highly parallel
- massively parallel
- databases
- video sequences
- multiscale
- floating point
- parallel programming
- bayesian networks
- parallel architectures
- times faster
- shared memory
- learning algorithm
- machine learning
- ray casting
- gpu accelerated
- neural network
- computing systems