Approximate dynamic programming using model-free Bellman Residual Elimination.

Brett Bethke, Jonathan P. How

Published in: ACC (2010)

Keyphrases

policy iteration
approximate dynamic programming
model free
reinforcement learning
function approximation
policy evaluation
temporal difference
reinforcement learning algorithms
average reward
Markov decision processes
Markov decision problems
machine learning
learning algorithm
Markov chain