Online Bellman Residual and Temporal Difference Algorithms with Predictive Error Guarantees.

Wen Sun, J. Andrew Bagnell

Published in: IJCAI (2016)

Keyphrases

policy iteration
temporal difference
policy evaluation
monte carlo
learning algorithm
model free
sample path
reinforcement learning
least squares
markov decision processes
neural network
support vector
evolutionary algorithm
hybrid algorithms
td learning