GPT-Critic: Offline Reinforcement Learning for End-to-End Task-Oriented Dialogue Systems.
Youngsoo Jang, Jongmin Lee, Kee-Eung Kim
Published in: ICLR (2022)
Keyphrases
- end to end
- dialogue system
- reinforcement learning
- actor critic
- temporal difference
- function approximation
- reinforcement learning algorithms
- dialogue management
- tutorial dialogue
- policy gradient
- natural language
- spoken dialogue systems
- mixed initiative
- wireless ad hoc networks
- model free
- real time
- state space
- admission control
- congestion control
- machine learning
- human users
- optimal control
- user model
- application layer
- multi agent
- dialogue games