On-line Active Reward Learning for Policy Optimisation in Spoken Dialogue Systems.
Pei-Hao Su, Milica Gasic, Nikola Mrksic, Lina Maria Rojas-Barahona, Stefan Ultes, David Vandyke, Tsung-Hsien Wen, Steve J. Young
Published in: ACL (1) (2016)
Keyphrases
- reinforcement learning
- learning algorithm
- learning tasks
- knowledge acquisition
- learning process
- domain specific
- spoken dialogue systems
- partially observable environments
- expert systems
- social networks
- machine learning
- optimal policy
- domain independent
- artificial intelligence
- human users
- dialogue system
- dialogue management