Multi-stage Gradient Compression: Overcoming the Communication Bottleneck in Distributed Deep Learning.
Qu Lu, Wantao Liu, Jizhong Han, Jinrong Guo
Published in: ICONIP (1) (2018)
Keyphrases
- multistage
- deep learning
- interconnection networks
- single point of failure
- single stage
- unsupervised learning
- unsupervised feature learning
- stochastic programming
- lot sizing
- dynamic programming
- mental models
- stochastic optimization
- optimal policy
- deep architectures
- weakly supervised
- machine learning
- maximum likelihood
- information extraction
- viewpoint
- computer vision