Learning online alignments with continuous rewards policy gradient.
Yuping Luo
Chung-Cheng Chiu
Navdeep Jaitly
Ilya Sutskever
Published in:
ICASSP (2017)
Keyphrases
reinforcement learning
learning algorithm
policy gradient
learning tasks
learning process
cost function
decision problems
learning problems
function approximation