Accelerating Asynchronous Stochastic Gradient Descent for Neural Machine Translation.
Nikolay Bogoychev, Kenneth Heafield, Alham Fikri Aji, Marcin Junczys-Dowmunt
Published in: EMNLP (2018)
Keyphrases
- machine translation
- stochastic gradient descent
- least squares
- loss function
- step size
- matrix factorization
- random forests
- natural language processing
- information extraction
- cross lingual
- regularization parameter
- cross language information retrieval
- natural language
- statistical machine translation
- weight vector
- support vector machine
- multiple kernel learning
- machine translation system
- target language
- online algorithms
- feature extraction
- importance sampling
- active learning