Reinforcement Learning for Bandit Neural Machine Translation with Simulated Human Feedback.

Khanh Nguyen, Hal Daumé III, Jordan L. Boyd-Graber

Published in: CoRR (2017)

Keyphrases

machine translation
reinforcement learning
natural language processing
language independent
cross lingual
information extraction
statistical machine translation
human subjects
sensory inputs
target language
cross language information retrieval
word sense disambiguation
language processing
natural language generation
natural language
Chinese English
language resources
machine translation system
Brazilian Portuguese
word level
word alignment
query translation
state space
transfer learning
dynamic programming
machine learning
question answering
markov chain
source language
tasks in natural language processing