On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima.
Nitish Shirish Keskar, Dheevatsa Mudigere, Jorge Nocedal, Mikhail Smelyanskiy, Ping Tak Peter Tang
Published in: CoRR (2016)
Keyphrases
- deep learning
- deep architectures
- restricted boltzmann machine
- unsupervised learning
- unsupervised feature learning
- machine learning
- deep belief networks
- training set
- mental models
- training examples
- computer vision
- pattern recognition
- training samples
- viewpoint
- feature extraction
- learning strategies
- weakly supervised
- text classification
- probabilistic model