Online Bellman Residual and Temporal Difference Algorithms with Predictive Error Guarantees.
Wen Sun
J. Andrew Bagnell
Published in:
IJCAI (2016)
Keyphrases
policy iteration
temporal difference
policy evaluation
Monte Carlo
learning algorithm
model free
sample path
reinforcement learning
least squares
Markov decision processes
neural network
support vector
evolutionary algorithm
hybrid algorithms
TD learning