Optimized Broadcast for Deep Learning Workloads on Dense-GPU InfiniBand Clusters: MPI or NCCL?
Ammar Ahmad Awan
Ching-Hsiang Chu
Hari Subramoni
Dhabaleswar K. Panda
Published in:
EuroMPI (2018)
Keyphrases
deep learning
parallel implementation
parallel computing
clustering algorithm
unsupervised feature learning
unsupervised learning
parallel programming
machine learning
data mining
high performance computing
training set
document clustering
weakly supervised
feature selection
deep architectures