Deep Reinforcement Learning of Dialogue Policies with Less Weight Updates.
Heriberto Cuayáhuitl, Seunghak Yu
Published in: INTERSPEECH (2017)
Keyphrases
- reinforcement learning
- optimal policy
- policy search
- markov decision process
- control policies
- function approximation
- fitted q iteration
- reward function
- markov decision processes
- mixed initiative
- reinforcement learning agents
- hierarchical reinforcement learning
- markov decision problems
- dialogue system
- continuous state
- state space
- policy gradient methods
- partially observable markov decision processes
- finite state
- human machine
- control policy
- reinforcement learning algorithms
- learning algorithm
- dynamic programming
- dialogue management
- sufficient conditions
- model free
- decision problems
- speech acts
- multi agent
- learning process
- optimal control
- learning classifier systems
- spoken dialogue systems
- natural language
- human computer
- machine learning
- tabula rasa
- natural language dialogue
- state abstraction
- partially observable
- conversational agent
- average reward
- infinite horizon
- interactive systems
- function approximators
- action space
- action selection
- weight vector