Fantastic Rewards and How to Tame Them: A Case Study on Reward Learning for Task-oriented Dialogue Systems.
Yihao Feng
Shentao Yang
Shujian Zhang
Jianguo Zhang
Caiming Xiong
Mingyuan Zhou
Huan Wang
Published in:
ICLR (2023)
Keyphrases
dialogue system
reinforcement learning
tutorial dialogue
human computer
learning process
learning algorithm
bandit problems
hidden markov models
mixed initiative