Mapping Language to Programs using Multiple Reward Components with Inverse Reinforcement Learning.

Sayan Ghosh, Shashank Srivastava
Published in: EMNLP (Findings) (2021)
Keyphrases
  • inverse reinforcement learning
  • partially observable environments
  • bayesian nonparametric
  • reward function
  • reinforcement learning
  • preference elicitation
  • artificial intelligence
  • temporal difference