Off-policy learning in large-scale POMDP-based dialogue systems.

Lucie Daubigney, Matthieu Geist, Olivier Pietquin

Published in: ICASSP (2012)

Keyphrases

dialogue system
reinforcement learning
learning algorithm
knowledge acquisition
ground truth
human computer
continuous state
predictive state representations