Towards Explaining the Regularization Effect of Initial Large Learning Rate in Training Neural Networks
Yuanzhi Li, Colin Wei, Tengyu Ma. Published in: NeurIPS (2019)
Keyphrases
- learning rate
- backpropagation algorithm
- training algorithm
- neural network
- multilayer neural networks
- adaptive learning rate
- hidden layer
- feed forward neural networks
- training speed
- activation function
- learning algorithm
- training process
- convergence rate
- convergence speed
- rapid convergence
- error function
- back propagation
- artificial neural networks
- convergence theorem
- fuzzy neural network
- neural network model
- delta bar delta
- weight vector
- multi layer perceptron
- neural nets
- recurrent neural networks
- training set
- data mining