SSD-SGD: Communication Sparsification for Distributed Deep Learning Training.
Yemao Xu, Dezun Dong, Yawei Zhao, Weixia Xu, Xiangke Liao
Published in: CoRR (2020)
Keyphrases
- deep learning
- deep architectures
- restricted Boltzmann machine
- unsupervised learning
- stochastic gradient descent
- unsupervised feature learning
- least squares
- training examples
- machine learning
- deep belief networks
- supervised learning
- training set
- mental models
- pattern recognition
- training data
- probabilistic model
- weakly supervised
- multiscale
- feature selection