Transfer Reward Learning for Policy Gradient-Based Text Generation.
James O'Neill
Danushka Bollegala
Published in:
CoRR (2019)
Keyphrases
partially observable environments
inverse reinforcement learning
reinforcement learning
learning process
online learning
learning systems
learning algorithm
supervised learning
learning tasks
knowledge transfer
dynamic programming
natural language processing
state action
policy gradient