Learning online alignments with continuous rewards policy gradient.

Yuping Luo, Chung-Cheng Chiu, Navdeep Jaitly, Ilya Sutskever
Published in: ICASSP (2017)
Keyphrases
  • reinforcement learning
  • learning algorithm
  • policy gradient
  • learning tasks
  • learning process
  • cost function
  • decision problems
  • learning problems
  • function approximation