Importance Sampling Policy Evaluation with an Estimated Behavior Policy.
Josiah Hanna, Scott Niekum, Peter Stone
Published in: ICML (2019)
Keyphrases
- policy evaluation
- importance sampling
- monte carlo
- variance reduction
- temporal difference
- least squares
- markov chain
- reinforcement learning
- kalman filter
- particle filter
- policy iteration
- markov chain monte carlo
- model free
- optimal policy
- markov decision processes
- function approximation
- semi parametric
- sample size
- particle filtering
- approximate inference
- probability distribution
- pairwise
- policy gradient
- learning algorithm