Large Learning Rate Tames Homogeneity: Convergence and Balancing Effect.
Yuqing Wang, Minshuo Chen, Tuo Zhao, Molei Tao
Published in: ICLR (2022)
Keyphrases
- learning rate
- convergence rate
- rapid convergence
- convergence theorem
- convergence speed
- adaptive learning rate
- update rule
- learning algorithm
- hidden layer
- weight update
- error function
- multilayer neural networks
- conjugate gradient algorithm
- training algorithm
- step size
- delta bar delta
- uniform convergence
- weight vector
- activation function
- neural network
- conjugate gradient
- primal dual
- particle swarm optimization
- multi class
- bp neural network algorithm