SGD for Structured Nonconvex Functions: Learning Rates, Minibatching and Interpolation.

Robert M. Gower, Othmane Sebbouh, Nicolas Loizou

Published in: AISTATS (2021)

Keyphrases

learning rate
convergence rate
covering numbers
learning algorithm
Gaussian kernels
uniform convergence
weight vector
stochastic gradient descent
VC dimension
convex optimization
learning theory
convergence speed
global optimization
special case
particle swarm optimization
upper bound
evolutionary algorithm
feature selection