GPT-Critic: Offline Reinforcement Learning for End-to-End Task-Oriented Dialogue Systems.

Youngsoo Jang, Jongmin Lee, Kee-Eung Kim

Published in: ICLR (2022)

Keyphrases

end to end
dialogue system
reinforcement learning
actor critic
temporal difference
function approximation
reinforcement learning algorithms
dialogue management
tutorial dialogue
policy gradient
natural language
spoken dialogue systems
mixed initiative
wireless ad hoc networks
model free
real time
state space
admission control
congestion control
machine learning
human users
optimal control
user model
application layer
multi agent
dialogue games