Finite-Time Error Bounds For Linear Stochastic Approximation and TD Learning.

R. Srikant, Lei Ying

Published in: CoRR (2019)

Keyphrases

error bounds
stochastic approximation
TD learning
temporal difference
Monte Carlo
policy iteration
theoretical analysis
temporal difference learning
evaluation function
worst case
function approximation
policy evaluation
model free
reinforcement learning
neural network
reinforcement learning algorithms
linear combination
function approximators
theoretical guarantees
sufficient conditions
training data
learning algorithm