Entropic gradient descent algorithms and wide flat minima.
Fabrizio Pittorino, Carlo Lucibello, Christoph Feinauer, Gabriele Perugini, Carlo Baldassi, Elizaveta Demyanenko, Riccardo Zecchina
Published in: ICLR (2021)
Keyphrases
- orders of magnitude
- significant improvement
- computational cost
- recently developed
- learning algorithm
- data structure
- artificial neural networks
- real time
- mutual information
- benchmark datasets
- worst case
- conjugate gradient
- computationally expensive
- loss function
- scale space
- optimization problems
- semi-supervised
- genetic algorithm
- neural network