Off-Policy Evaluation with Out-of-Sample Guarantees.
Sofia Ek, Dave Zachariah, Fredrik D. Johansson, Peter Stoica
Published in: Trans. Mach. Learn. Res. (2023)
Keyphrases
- policy evaluation
- least squares
- monte carlo
- temporal difference
- model free
- reinforcement learning
- policy iteration
- markov decision processes
- variance reduction
- function approximation
- matrix inversion
- semi parametric
- optimal policy
- statistical inference
- partially observable markov decision processes
- reinforcement learning algorithms
- linear regression
- markov decision problems
- state space
- gaussian process
- density estimation
- sufficient conditions