Policy optimisation of POMDP-based dialogue systems without state space compression.
Milica Gasic, Matthew Henderson, Blaise Thomson, Pirros Tsiakoulis, Steve J. Young
Published in: SLT (2012)
Keyphrases
- state space
- dialogue system
- partially observable Markov decision process
- optimal policy
- dialogue management
- Markov decision process
- partially observable
- partially observable Markov decision processes
- Markov decision problems
- reinforcement learning
- state and action spaces
- belief state
- Markov decision processes
- reward function
- continuous state spaces
- natural language
- action space
- heuristic search
- mixed initiative
- dynamical systems
- spoken dialogue systems
- Markov chain
- dynamic programming
- tutorial dialogue
- human users
- model free reinforcement learning
- infinite horizon
- long run
- finite state
- planning problems
- continuous state
- initial state
- decision theoretic
- search space
- partial observability
- dialogue games
- belief space
- control policies
- state dependent
- reinforcement learning algorithms
- state variables
- decision problems
- multi agent
- learning algorithm
- point based value iteration
- machine learning