1-bit stochastic gradient descent and its application to data-parallel distributed training of speech DNNs.
Frank Seide
Hao Fu
Jasha Droppo
Gang Li
Dong Yu
Published in:
INTERSPEECH (2014)
Keyphrases
data sets
training data
small number
training samples
stochastic gradient descent
machine learning
lower bound
cost function
data points
supervised learning
input space
random forests
training dataset
parallel distributed