Marthe: Scheduling the Learning Rate Via Online Hypergradients.
Michele Donini, Luca Franceschi, Orchid Majumder, Massimiliano Pontil, Paolo Frasconi
Published in: IJCAI (2020)
Keyphrases
- learning rate
- convergence rate
- learning algorithm
- error function
- online learning
- hidden layer
- scheduling problem
- rapid convergence
- convergence speed
- multilayer neural networks
- adaptive learning rate
- bp neural network algorithm
- weight vector
- neural network
- training algorithm
- delta bar delta
- global optimization
- convergence theorem
- global convergence
- activation function
- training samples
- step size
- scheduling algorithm