Meta-Regularization: An Approach to Adaptive Choice of the Learning Rate in Gradient Descent.
Guangzeng Xie, Hao Jin, Dachao Lin, Zhihua Zhang
Published in: CoRR (2021)
Keyphrases
- learning rate
- error function
- update rule
- convergence rate
- natural gradient
- covering numbers
- learning algorithm
- adaptive learning rate
- convergence speed
- hidden layer
- uniform convergence
- training algorithm
- rapid convergence
- cost function
- multilayer neural networks
- activation function
- conjugate gradient
- weight vector
- objective function