The Step Decay Schedule: A Near Optimal, Geometrically Decaying Learning Rate Procedure.
Rong Ge, Sham M. Kakade, Rahul Kidambi, Praneeth Netrapalli
Published in: CoRR (2019)
Keyphrases
- learning rate
- learning algorithm
- convergence rate
- error function
- rapid convergence
- hidden layer
- convergence speed
- scheduling problem
- adaptive learning rate
- multilayer neural networks
- neural network
- activation function
- convergence theorem
- principal components
- training samples
- high accuracy
- middle layer
- support vector machine
- machine learning