Theoretical and Experimental Comparison of Off-Policy Evaluation from Dependent Samples.
Masahiro Kato
Published in: CoRR (2020)
Keyphrases
- experimental comparison
- policy evaluation
- least squares
- temporal difference
- monte carlo
- model free
- reinforcement learning
- markov decision processes
- matrix inversion
- policy iteration
- feature selection
- function approximation
- variance reduction
- semi parametric
- sufficient conditions
- dynamic programming
- reinforcement learning algorithms
- training set
- markov decision problems