Why Does Sharpness-Aware Minimization Generalize Better Than SGD?
Zixiang Chen
Junkai Zhang
Yiwen Kou
Xiangning Chen
Cho-Jui Hsieh
Quanquan Gu
Published in: NeurIPS (2023)
Keyphrases
objective function
stochastic gradient descent
visual perception
neural network
information content
convex optimization
anisotropic diffusion
quality assessment
edge preserving
geometric interpretation