On-line Active Reward Learning for Policy Optimisation in Spoken Dialogue Systems.
Pei-Hao Su
Milica Gasic
Nikola Mrksic
Lina Maria Rojas-Barahona
Stefan Ultes
David Vandyke
Tsung-Hsien Wen
Steve J. Young
Published in:
CoRR (2016)
Keyphrases
learning algorithm
learning tasks
reinforcement learning
partially observable environments
inverse reinforcement learning
semi-supervised
knowledge management