On parallelizability of stochastic gradient descent for speech DNNs.
Frank Seide
Hao Fu
Jasha Droppo
Gang Li
Dong Yu
Published in:
ICASSP (2014)
Keyphrases
stochastic gradient descent
least squares
loss function
matrix factorization
step size
random forests
support vector machine
multiple kernel learning
weight vector
machine learning
decision trees
regularization parameter
importance sampling
feature extraction
logistic regression
online algorithms
feature selection