Importance Sampling Policy Evaluation with an Estimated Behavior Policy.
Josiah Hanna, Scott Niekum, Peter Stone
Published in: CoRR (2018)
Keyphrases
- policy evaluation
- importance sampling
- Monte Carlo
- variance reduction
- Markov chain
- temporal difference
- least squares
- Markov decision processes
- policy iteration
- model free
- particle filtering
- function approximation
- optimal policy
- particle filter
- reinforcement learning
- posterior distribution
- Kalman filter
- image segmentation