Error Bounds in Reinforcement Learning Policy Evaluation.
Fletcher Lu
Published in: Canadian Conference on AI (2005)
Keyphrases
- error bounds
- policy evaluation
- reinforcement learning
- least squares
- temporal difference
- model free
- monte carlo
- policy iteration
- function approximation
- markov decision processes
- theoretical analysis
- td learning
- worst case
- variance reduction
- optimal policy
- reinforcement learning algorithms
- machine learning
- semi parametric
- state space
- gaussian process
- statistical inference
- action selection
- dynamic programming
- partially observable markov decision processes
- importance sampling
- reinforcement learning methods
- average reward
- transfer learning
- upper bound
- evaluation function