FastHorovod: Expediting Parallel Message-Passing Schedule for Distributed DNN Training.
Yanghai Wang, Dezun Dong, Yemao Xu, Shuo Ouyang, Xiangke Liao
Published in: ISCC (2021)
Keyphrases
- message passing
- distributed systems
- shared memory
- training process
- distributed shared memory
- message passing interface
- interconnection networks
- belief propagation
- approximate inference
- probabilistic inference
- inference in graphical models
- scheduling problem
- factor graphs
- parallel execution
- parallel programming
- sum product
- parallel computing
- parallel processing
- Markov random field
- distributed memory
- parallel algorithm
- sum product algorithm
- LDPC codes
- training data
- structured prediction
- active contours
- image sequences
- image processing