Stochasticity of Deterministic Gradient Descent: Large Learning Rate for Multiscale Objective Function.
Lingkai Kong, Molei Tao. Published in: NeurIPS (2020)
Keyphrases
- learning rate
- objective function
- multiscale
- error function
- cost function
- convergence rate
- update rule
- natural gradient
- weight vector
- learning algorithm
- rapid convergence
- adaptive learning rate
- optimal solution
- multi-objective
- image segmentation
- multilayer neural networks
- hidden layer
- image processing
- training algorithm
- optimization problems
- activation function
- linear program
- constrained optimization
- linear programming
- regularization term
- global optimum
- convergence speed
- backpropagation
- delta bar delta
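Several of the keyphrases above (learning rate, update rule, objective function) refer to the basic gradient-descent iteration that the paper studies under large learning rates. A minimal illustrative sketch follows; the multiscale-style test objective is hypothetical and not taken from the paper:

```python
import numpy as np

def gradient_descent(grad, x0, lr, steps):
    """Plain gradient-descent update rule: x_{k+1} = x_k - lr * grad(x_k)."""
    x = x0
    for _ in range(steps):
        x = x - lr * grad(x)
    return x

# Hypothetical multiscale-style objective: f(x) = x^2 / 2 + 0.01 * cos(50 x),
# i.e. a smooth macroscopic quadratic plus small fast oscillations.
grad = lambda x: x - 0.5 * np.sin(50 * x)

x_small = gradient_descent(grad, x0=1.0, lr=0.01, steps=1000)  # small learning rate
x_large = gradient_descent(grad, x0=1.0, lr=0.50, steps=1000)  # large learning rate
```

With a small learning rate the iterate can get trapped near a microscopic local minimum of the oscillatory term, whereas a large learning rate lets it step over the fine-scale wiggles, which is the regime the paper's title refers to.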