Noise and Fluctuation of Finite Learning Rate Stochastic Gradient Descent
Kangqiao Liu, Ziyin Liu, Masahito Ueda. Published in: ICML (2021)
Keyphrases
- learning rate
- stochastic gradient descent
- weight vector
- training speed
- convergence rate
- step size
- convergence speed
- learning algorithm
- least squares
- loss function
- matrix factorization
- noise level
- missing data
- support vector machine
- perceptron algorithm
- monte carlo
- multi class
- random forests
- training data
- image sequences