Consistency of HDP applied to a simple reinforcement learning problem.

Published in: Neural Networks (1990)

Keyphrases

reinforcement learning
information systems
dynamic programming
learning algorithm
Bayesian networks
multi-agent
Markov decision processes
function approximation