Stochastic Gradient Descent for Non-smooth Optimization: Convergence Results and Optimal Averaging Schemes
Ohad Shamir, Tong Zhang. Published in: ICML (1) (2013)
Keyphrases
- stochastic gradient descent
- stochastic gradient
- early stopping
- number of iterations required
- step size
- least squares
- loss function
- matrix factorization
- random forests
- convergence rate
- support vector machine
- worst case
- online algorithms
- regularization parameter
- optimal solution
- weight vector
- convergence speed
- global optimization
- optimization algorithm
- learning rate
- image processing
- multiple kernel learning
- asymptotically optimal
- cost function
- recommender systems
- pairwise
- particle swarm optimization
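Several of the keyphrases above (stochastic gradient descent, step size, convergence rate, loss function) center on subgradient descent for non-smooth objectives. As a minimal sketch of that setting, the following assumes a simple non-smooth loss f(x) = |x - target| and the standard step size c/sqrt(t); it compares the last iterate with a plain running average of iterates. This is illustrative only, not code or an averaging scheme from the paper itself.

```python
import math

def subgradient(x, target=1.0):
    # A subgradient of the non-smooth loss f(x) = |x - target|.
    if x > target:
        return 1.0
    if x < target:
        return -1.0
    return 0.0  # 0 is a valid subgradient at the kink

def sgd_nonsmooth(T=10000, c=1.0, x0=5.0):
    """Subgradient descent with step size c / sqrt(t), a standard
    choice for non-smooth convex problems, returning both the last
    iterate and a uniform running average of all iterates."""
    x = x0
    avg = x0
    for t in range(1, T + 1):
        x = x - (c / math.sqrt(t)) * subgradient(x)
        # Incremental update of the uniform average of x_1, ..., x_t.
        avg = avg + (x - avg) / t
    return x, avg
```

With this step size, the last iterate oscillates around the minimizer with shrinking amplitude, while the average smooths those oscillations at the cost of weight on early, far-off iterates; the trade-off between such schemes is exactly the kind of question the paper's title refers to.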