Width of Minima Reached by Stochastic Gradient Descent is Influenced by Learning Rate to Batch Size Ratio.

Stanislaw Jastrzebski, Zachary Kenton, Devansh Arpit, Nicolas Ballas, Asja Fischer, Yoshua Bengio, Amos J. Storkey
Published in: ICANN (3) (2018)
Keyphrases