Instance-Dependent ℓ∞-Bounds for Policy Evaluation in Tabular Reinforcement Learning.
Ashwin Pananjady, Martin J. Wainwright
Published in: IEEE Trans. Inf. Theory (2021)
Keyphrases
- policy evaluation
- reinforcement learning
- variance reduction
- temporal difference
- least squares
- model free
- monte carlo
- function approximation
- markov decision processes
- td learning
- policy iteration
- sample size
- semi parametric
- upper bound
- lower bound
- reinforcement learning algorithms
- optimal policy
- worst case
- state space
- supervised learning
- learning algorithm
- decision making
- step size
- importance sampling
- multi agent
- machine learning
- statistical learning
- optimal control
- np hard
- evaluation function
- decision problems
- dynamic programming