Wide-minima Density Hypothesis and the Explore-Exploit Learning Rate Schedule.
Nikhil Iyer, V. Thejas, Nipun Kwatra, Ramachandran Ramjee, Muthian Sivathanu
Published in: J. Mach. Learn. Res. (2023)
Keyphrases
- learning rate
- convergence rate
- learning algorithm
- error function
- hidden layer
- convergence speed
- multilayer neural networks
- scheduling problem
- rapid convergence
- weight vector
- activation function
- adaptive learning rate
- training algorithm
- neural network
- feature selection
- genetic algorithm (GA)
- search space
- step size
- global convergence
- natural gradient
- evolutionary algorithm