Catapults in SGD: spikes in the training loss and their impact on generalization through feature learning.
Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan, Mikhail Belkin
Published in: CoRR (2023)
Keyphrases
- online learning
- learning algorithm
- supervised learning
- stochastic gradient descent
- active learning
- learning machines
- machine learning
- learning speed
- computer software
- motor skills
- neural network
- feedforward neural networks
- inductive inference
- learning problems
- learning systems
- knowledge acquisition
- learning process
- learning tasks
- machine learning algorithms
- unsupervised learning
- learning stage
- reinforcement learning