Login / Signup

Why Does Large Batch Training Result in Poor Generalization? A Comprehensive Explanation and a Better Strategy from the Viewpoint of Stochastic Optimization.

Tomoumi TakaseSatoshi OyamaMasahito Kurihara
Published in: Neural Comput. (2018)
Keyphrases
  • stochastic optimization
  • viewpoint
  • multistage
  • batch mode
  • decision making
  • cooperative
  • training set
  • special case
  • training examples
  • training process
  • np hard
  • probability distribution