A generic communication scheduler for distributed DNN training acceleration.
Yanghua Peng, Yibo Zhu, Yangrui Chen, Yixin Bao, Bairen Yi, Chang Lan, Chuan Wu, Chuanxiong Guo
Published in: SOSP (2019)
Keyphrases
- training process
- communication overhead
- communication cost
- distributed control
- spatially distributed
- distributed computation
- distributed systems
- computer networks
- communication networks
- training algorithm
- fully distributed
- domain specific
- distributed environment
- exchange information
- communication protocol
- multi party
- neural network
- multi agent
- single point of failure
- computing environments
- training examples
- lightweight
- information dissemination
- concurrent processes
- peer to peer
- group communication
- distributed network
- distributed teams
- hearing impaired
- resource manager
- multi robot coordination
- global knowledge
- remote sites
- network nodes
- training phase
- communication channels
- distributed computing
- online learning
- supervised learning
- training data