Natural actor and belief critic: Reinforcement algorithm for learning parameters of dialogue systems modelled as POMDPs.
Filip Jurcícek
Blaise Thomson
Steve J. Young
Published in:
ACM Trans. Speech Lang. Process. (2011)
Keyphrases
learning algorithm
reinforcement learning
expectation maximization
dynamic programming
actor critic
dialogue system
model free
policy gradient
multi agent
knowledge acquisition
intelligent tutoring systems
function approximation
reinforcement learning algorithms