On Avoiding Local Minima Using Gradient Descent With Large Learning Rates.
Amirkeivan Mohtashami
Martin Jaggi
Sebastian U. Stich
Published in:
CoRR (2022)
Keyphrases
learning rate
error function
update rule
learning algorithm
convergence rate
cost function
global minimum
uniform convergence
convergence speed
covering numbers
convergence theorem
gaussian kernels
objective function
loss function
simulated annealing
weight vector
upper bound
lower bound