Effect of Random Learning Rate: Theoretical Analysis of SGD Dynamics in Non-Convex Optimization via Stationary Distribution.
Naoki Yoshida, Shogo Nakakita, Masaaki Imaizumi
Published in: CoRR (2024)
Keyphrases
- convex optimization
- learning rate
- stationary distribution
- Markov chain
- learning algorithm
- convergence rate
- random walk
- initial state
- total variation
- transition probabilities
- convergence speed
- primal dual
- augmented Lagrangian
- dynamical systems
- steady state
- queueing networks
- weight vector
- sufficient conditions
- state space
- multiscale
- queue length
- stochastic gradient descent
- multiresolution
- image processing
- graphical models
- higher order
- service times
- genetic algorithm
- multi class