Login / Signup
Adafactor: Adaptive Learning Rates with Sublinear Memory Cost.
Noam Shazeer
Mitchell Stern
Published in:
CoRR (2018)
Keyphrases
</>
learning rate
gaussian kernels
convergence rate
learning algorithm
uniform convergence
error function
covering numbers
cost sensitive
convergence theorem
feature selection