Using Options and Covariance Testing for Long Horizon Off-Policy Policy Evaluation.
Zhaohan Guo, Philip S. Thomas, Emma Brunskill
Published in: NIPS (2017)
Keyphrases
- policy evaluation
- least squares
- temporal difference
- reinforcement learning
- matrix inversion
- model free
- function approximation
- policy iteration
- monte carlo
- variance reduction
- markov decision processes
- optimal policy
- statistical inference
- test set
- partially observable markov decision processes
- semi parametric
- cost function
- neural network