Towards Explaining the Regularization Effect of Initial Large Learning Rate in Training Neural Networks
Yuanzhi Li, Colin Wei, Tengyu Ma. Published in: NeurIPS (2019)
Keyphrases
- learning rate
- backpropagation algorithm
- training algorithm
- neural network
- multilayer neural networks
- adaptive learning rate
- hidden layer
- feed forward neural networks
- training speed
- activation function
- learning algorithm
- training process
- convergence rate
- convergence speed
- rapid convergence
- error function
- back propagation
- artificial neural networks
- convergence theorem
- fuzzy neural network
- neural network model
- delta bar delta
- weight vector
- multi layer perceptron
- neural nets
- recurrent neural networks
- training set
- data mining