Noise Is Not the Main Factor Behind the Gap Between SGD and Adam on Transformers, but Sign Descent Might Be.

Frederik Kunstner, Jacques Chen, Jonathan Wilder Lavington, Mark Schmidt
Published in: CoRR (2023)
Keyphrases
  • noisy data
  • random noise
  • data sets
  • noise level
  • additive noise
  • neural network
  • machine learning
  • least squares
  • denoising
  • missing data
  • signal to noise ratio
  • factor analysis
  • noise removal
  • low signal to noise ratio