A Closer Look at Deep Learning Heuristics: Learning rate restarts, Warmup and Distillation.
Akhilesh Gotmare
Nitish Shirish Keskar
Caiming Xiong
Richard Socher
Published in:
ICLR (Poster) (2019)
Keyphrases
learning rate
deep learning
unsupervised learning
convergence rate
learning algorithm
machine learning
unsupervised feature learning
rapid convergence
hidden layer
adaptive learning rate
convergence speed
weakly supervised
delta bar delta
mental models
feature vectors
deep architectures
convergence theorem
pairwise