Adam Accumulation to Reduce Memory Footprints of both Activations and Gradients for Large-scale DNN Training.
Yijia Zhang
Yibo Han
Shijie Cao
Guohao Dai
Youshan Miao
Ting Cao
Fan Yang
Ningyi Xu
Published in:
CoRR (2023)
Keyphrases
training process
training set
small scale
real life
supervised learning
test set
classifier training
computing power
real world
online learning
training data
artificial neural networks
serious games
significantly reduced
training phase
information retrieval
web scale
data mining