Understanding the Curse of Horizon in Off-Policy Evaluation via Conditional Importance Sampling.
Yao Liu, Pierre-Luc Bacon, Emma Brunskill
Published in: CoRR (2019)
Keyphrases
- importance sampling
- policy evaluation
- monte carlo
- variance reduction
- markov chain
- temporal difference
- least squares
- particle filter
- kalman filter
- particle filtering
- approximate inference
- high dimensional
- model free
- markov chain monte carlo
- state space
- high dimensional data
- computer vision
- prior knowledge
- policy iteration
- semi parametric
- reinforcement learning