WNGrad: Learn the Learning Rate in Gradient Descent.
Xiaoxia Wu, Rachel Ward, Léon Bottou. Published in: CoRR (2018)
Keyphrases
- learning rate
- error function
- update rule
- natural gradient
- learning algorithm
- convergence rate
- adaptive learning rate
- rapid convergence
- hidden layer
- convergence speed
- weight vector
- training algorithm
- convergence theorem
- loss function
- BP neural network algorithm
- delta-bar-delta
- activation function
- linear programming
- cost function
- multi-objective
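For orientation, below is a minimal illustrative sketch of the kind of self-tuning learning-rate scheme the title and keyphrases (update rule, adaptive learning rate) refer to: plain gradient descent whose scalar step size 1/b is adapted online from observed gradient norms. The function names, starting values, and the exact recursion for b are assumptions for illustration only; the precise WNGrad update rule and its convergence theorem are given in the paper itself.

```python
import numpy as np

def adaptive_gradient_descent(grad, w0, b0=1.0, steps=100):
    """Illustrative sketch (not the paper's verbatim algorithm):
    gradient descent with a learned scalar learning rate 1/b that
    is updated from the gradient norms seen so far."""
    w, b = np.asarray(w0, dtype=float), float(b0)
    for _ in range(steps):
        g = grad(w)
        w = w - g / b             # descent step with learning rate 1/b
        b = b + np.dot(g, g) / b  # grow b using the observed gradient norm
    return w

# Example: minimize f(w) = 0.5 * ||w||^2, whose gradient is w.
w_min = adaptive_gradient_descent(lambda w: w, w0=[3.0, -4.0])
print(w_min)
```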