Scalable Distributed DNN Training using TensorFlow and CUDA-Aware MPI: Characterization, Designs, and Performance Evaluation.
Ammar Ahmad Awan, Jeroen Bédorf, Ching-Hsiang Chu, Hari Subramoni, Dhabaleswar K. Panda
Published in: CoRR (2018)
Keyphrases
- scalable distributed
- parallel implementation
- training process
- general purpose
- parallel computing
- training set
- parallel algorithm
- artificial neural networks
- parallel programming
- data sets
- text classification
- real time
- file system
- message passing
- shared memory
- machine learning
- training samples
- test set
- training algorithm
- pairwise
- training phase