Performance analysis of memory transfers and GEMM subroutines on NVIDIA Tesla GPU cluster.
Veerendra Allada, Troy Benjegerdes, Brett M. Bode
Published in: CLUSTER (2009)
Keyphrases
- graphics processors
- graphics processing units
- parallel implementation
- graphics hardware
- GPU implementation
- clustering algorithm
- real time
- data clustering
- hierarchical clustering
- CPU implementation
- memory usage
- memory requirements
- general purpose
- data points
- parallel computing
- cluster centers
- memory space
- memory bandwidth
- computing systems
- data transfer
- computing power
- parallel algorithm
- clustering framework
- memory size
- GPU accelerated
- data structure