Faster Distributed Deep Net Training: Computation and Communication Decoupled Stochastic Gradient Descent.
Shuheng Shen, Linli Xu, Jingchang Liu, Xianfeng Liang, Yifei Cheng
Published in: IJCAI (2019)
Keyphrases
- stochastic gradient descent
- early stopping
- least squares
- loss function
- step size
- matrix factorization
- svm training
- random forests
- support vector machine
- decision trees
- regularization parameter
- weight vector
- online algorithms
- importance sampling
- supervised learning
- training set
- linear svm
- multiple kernel learning
- convergence rate
- logistic regression
- missing data
- online learning
- collaborative filtering
- small number
- support vector