Offloaded GPU Collectives Using CORE-Direct and CUDA Capabilities on InfiniBand Clusters.
Akshay Venkatesh, Khaled Hamidouche, Hari Subramoni, Dhabaleswar K. Panda
Published in: HiPC (2015)
Keyphrases
- parallel implementation
- GPU implementation
- graphics processors
- GPU accelerated
- graphics hardware
- parallel computing
- parallel computation
- clustering algorithm
- real time
- general purpose
- hierarchical clustering
- collective intelligence
- parallel programming
- unsupervised clustering
- graph clustering
- cluster analysis
- data clustering
- self organizing maps
- parallel processing
- arbitrary shape
- computational power
- graphics processing units
- compute unified device architecture
- times faster
- CPU implementation
- image segmentation
- feature space
- clustering method
- data distribution
- processing capabilities
- parallel algorithm
- processing units
- high performance computing