OD-SGD: One-step Delay Stochastic Gradient Descent for Distributed Training.
Yemao Xu, Dezun Dong, Weixia Xu, Xiangke Liao
Published in: CoRR (2020)
Keyphrases
- stochastic gradient descent
- least squares
- loss function
- matrix factorization
- step size
- early stopping
- training speed
- stochastic gradient
- random forests
- support vector machine
- linear svm
- regularization parameter
- weight vector
- alternating least squares
- online algorithms
- importance sampling
- multiple kernel learning
- logistic regression
- support vector
- pairwise
- svm solvers
- cross validation
- missing data
- machine learning