Optimal Off-Policy Evaluation for Reinforcement Learning with Marginalized Importance Sampling.
Tengyang Xie, Yifei Ma, Yu-Xiang Wang
Published in: CoRR (2019)
Keyphrases
- policy evaluation
- importance sampling
- Monte Carlo
- variance reduction
- reinforcement learning
- temporal difference
- Markov chain
- least squares
- model free
- function approximation
- particle filter
- state space
- dynamic programming
- Markov decision processes
- Kalman filter
- Markov chain Monte Carlo
- machine learning
- policy iteration
- reinforcement learning algorithms
- optimal control
- particle filtering
- closed form
- optimal policy
- approximate inference
- posterior distribution
- video sequences
- sample size
- optimal solution