End-to-End Offline Goal-Oriented Dialog Policy Learning via Policy Gradient.

Li Zhou, Kevin Small, Oleg Rokhlenko, Charles Elkan

Published in: CoRR (2017)

Keyphrases

end to end
policy gradient
goal oriented
learning algorithm
model free reinforcement learning
actor critic
admission control
learning process
learning problems
policy gradient methods
learning tasks
reinforcement learning
adaptive control
function approximation
gradient method
state action
evaluation function
domain independent
supervised learning
multi agent systems