Offline Reinforcement Learning for Mixture-of-Expert Dialogue Management.
Dhawal Gupta, Yinlam Chow, Mohammad Ghavamzadeh, Craig Boutilier
Published in: CoRR (2023)
Keyphrases
- dialogue management
- reinforcement learning
- learning agent
- dialogue system
- spoken dialogue systems
- natural language generation
- partially observable Markov decision process
- language understanding
- state space
- function approximation
- partially observable
- human experts
- optimal policy
- Markov decision processes
- reinforcement learning algorithms
- natural language
- model free
- optimal control
- domain experts
- supervised learning
- domain knowledge
- multi agent
- temporal difference
- transfer learning
- dynamic environments
- hidden Markov models
- learning algorithm