Dialogue Generation: From Imitation Learning to Inverse Reinforcement Learning.
Ziming Li
Julia Kiseleva
Maarten de Rijke
Published in:
AAAI (2019)
Keyphrases
imitation learning
inverse reinforcement learning
reinforcement learning
preference elicitation
reward function
maximum margin
humanoid robot
partially observable
learning algorithm
random variables
Markov decision processes
robotic systems
function approximation
temporal difference