Understanding the Generalization Benefits of Late Learning Rate Decay.
Yinuo Ren
Chao Ma
Lexing Ying
Published in: CoRR (2024)
Keyphrases
learning rate
learning algorithm
convergence rate
error function
hidden layer
rapid convergence
convergence speed
activation function
adaptive learning rate
convergence theorem
training algorithm
delta bar delta
multilayer neural networks
uniform convergence
weight vector
dynamic programming
step size