SSD-SGD: Communication Sparsification for Distributed Deep Learning Training.
Yemao Xu, Dezun Dong, Dongsheng Wang, Shi Xu, Enda Yu, Weixia Xu, Xiangke Liao
Published in: ACM Trans. Archit. Code Optim. (2023)
Keyphrases
- deep learning
- deep architectures
- restricted Boltzmann machine
- stochastic gradient descent
- unsupervised feature learning
- unsupervised learning
- machine learning
- deep belief networks
- least squares
- weakly supervised
- supervised learning
- online learning
- mental models
- training set
- training samples
- pattern recognition
- information extraction
- feature vectors
- multiscale
- face recognition