Stochastic Polyak Step-size for SGD: An Adaptive Learning Rate for Fast Convergence.
Nicolas Loizou, Sharan Vaswani, Issam Hadj Laradji, Simon Lacoste-Julien. Published in: AISTATS (2021)
Keyphrases
- step size
- learning rate
- convergence rate
- stochastic gradient descent
- convergence speed
- rapid convergence
- faster convergence
- global convergence
- convergence theorem
- variable step size
- update rule
- line search
- adaptive learning rate
- gradient method
- primal dual
- weight vector
- steepest descent method
- global optimum
- convergence analysis
- differential evolution
- hessian matrix
- pairwise
- neural network
- training algorithm
- learning algorithm
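The keyphrases above center on the paper's core idea: an adaptive step size for SGD derived from the classical Polyak step size. Below is a minimal, hedged sketch of an SPS_max-style update, assuming the interpolation setting where each per-sample optimal loss `f_i^*` is zero; the callables `loss_fn` and `grad_fn` and the constants `c` and `gamma_max` are illustrative placeholders, not the paper's exact implementation.

```python
import numpy as np

def sgd_sps(loss_fn, grad_fn, x0, data, c=0.5, gamma_max=1.0, epochs=100, seed=0):
    """SGD with a stochastic Polyak step size (SPS_max-style sketch).

    Assumes the per-sample optimum f_i^* = 0 (interpolation), so the
    step size at iterate x on sample d is
        min( f_i(x) / (c * ||grad f_i(x)||^2), gamma_max ).
    `loss_fn(x, d)` and `grad_fn(x, d)` are hypothetical user-supplied
    per-sample loss and gradient callables.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    n = len(data)
    for _ in range(epochs):
        for i in rng.permutation(n):
            g = grad_fn(x, data[i])
            g_norm2 = float(g @ g)
            if g_norm2 == 0.0:          # already optimal on this sample
                continue
            # Polyak step, capped at gamma_max for stability
            step = min(loss_fn(x, data[i]) / (c * g_norm2), gamma_max)
            x -= step * g
    return x

# Usage sketch: noiseless least squares, where interpolation holds.
rng = np.random.default_rng(1)
A = rng.standard_normal((20, 5))
w_true = rng.standard_normal(5)
y = A @ w_true
samples = list(zip(A, y))
loss = lambda w, d: 0.5 * (d[0] @ w - d[1]) ** 2
grad = lambda w, d: (d[0] @ w - d[1]) * d[0]
w_hat = sgd_sps(loss, grad, np.zeros(5), samples)
```

On this quadratic problem the per-sample Polyak step reduces to the Kaczmarz-style step `1 / ||a_i||^2`, which zeroes the residual on the chosen sample each update, so no learning-rate tuning is needed.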