Reward-Balancing for Statistical Spoken Dialogue Systems using Multi-objective Reinforcement Learning.
Stefan Ultes, Pawel Budzianowski, Iñigo Casanueva, Nikola Mrksic, Lina Maria Rojas-Barahona, Pei-Hao Su, Tsung-Hsien Wen, Milica Gasic, Steve J. Young
Published in: CoRR (2017)
Keyphrases
- reinforcement learning
- multi objective
- spoken dialogue systems
- evolutionary algorithm
- dialogue system
- human machine interaction
- multi objective optimization
- dialogue management
- machine learning
- learning algorithm
- multi domain
- model free
- state space
- objective function
- average reward
- reward function
- genetic algorithm
- learning agent
- multi agent
- human users
- temporal difference
- activity recognition
- domain independent
- reinforcement learning algorithms
- optimal policy
- context aware
- general purpose
- ground truth
- policy gradient
- eligibility traces