Good regularity creates large learning rate implicit biases: edge of stability, balancing, and catapult.
Yuqing Wang
Zhenghao Xu
Tuo Zhao
Molei Tao
Published in:
CoRR (2023)
Keyphrases
learning rate
uniform convergence
learning algorithm
convergence rate
error function
convergence speed
hidden layer
edge detection
adaptive learning rate
multilayer neural networks
training algorithm
rapid convergence
activation function
weight vector