OD-SGD: One-Step Delay Stochastic Gradient Descent for Distributed Training.
Yemao Xu, Dezun Dong, Yawei Zhao, Weixia Xu, Xiangke Liao
Published in: ACM Trans. Archit. Code Optim. (2020)
Keyphrases
- stochastic gradient descent
- least squares
- loss function
- matrix factorization
- stochastic gradient
- early stopping
- training speed
- step size
- random forests
- online algorithms
- support vector machine
- multiple kernel learning
- weight vector
- importance sampling
- regularization parameter
- alternating least squares
- support vector
- nonnegative matrix factorization
- collaborative filtering
- semi supervised
- cost function
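The title's "one-step delay" refers to decoupling gradient computation from the parameter update so the two can overlap in distributed training. Below is a minimal, hypothetical sketch of that idea on a toy convex objective — the function name `od_sgd_sketch` and the single-worker simplification are illustrative assumptions, not the paper's actual distributed algorithm:

```python
def od_sgd_sketch(grad_fn, w0, lr=0.1, steps=50):
    # Hypothetical simplification of the one-step-delay idea:
    # the update at step t applies the gradient computed at step t-1,
    # so in a distributed setting communication of the fresh gradient
    # can overlap with the next local computation.
    w = w0
    delayed_grad = 0.0  # no gradient has arrived yet at step 0
    for _ in range(steps):
        g = grad_fn(w)          # gradient at the current parameters
        w -= lr * delayed_grad  # apply the one-step-old gradient
        delayed_grad = g        # hand the fresh gradient to the next step
    return w

# Toy objective f(w) = 0.5 * w**2, whose gradient is w; the delayed
# iterates still converge toward the minimum at w = 0.
w_final = od_sgd_sketch(lambda w: w, w0=1.0)
```

With a sufficiently small step size the one-step-stale gradient still points roughly downhill, which is why such delayed schemes can trade a little statistical efficiency for better hardware utilization.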