Domain-Independent User Satisfaction Reward Estimation for Dialogue Policy Learning.
Stefan Ultes, Pawel Budzianowski, Iñigo Casanueva, Nikola Mrksic, Lina Maria Rojas-Barahona, Pei-Hao Su, Tsung-Hsien Wen, Milica Gasic, Steve J. Young
Published in: INTERSPEECH (2017)
Keyphrases
- domain independent
- domain specific
- user satisfaction
- domain dependent
- hand coded
- control knowledge
- reinforcement learning
- learning process
- state space search
- macro operators
- planning problems
- knowledge acquisition
- domain specific knowledge
- natural language
- databases
- policy gradient
- partially observable environments