Finite-Time Error Bounds for Biased Stochastic Approximation with Applications to Q-Learning.
Gang Wang
Georgios B. Giannakis
Published in:
AISTATS (2020)
Keyphrases
stochastic approximation
error bounds
Monte Carlo
theoretical analysis
worst case
reinforcement learning
finite number
policy iteration
Markov decision processes
machine learning
learning experience
temporal difference learning