Handling Confounding for Realistic Off-Policy Evaluation.
Saurabh Sohoney, Nikita Prabhu, Vineet Chaoji
Published in: WWW (Companion Volume) (2018)
Keyphrases
- policy evaluation
- least squares
- temporal difference
- reinforcement learning
- monte carlo
- model free
- markov decision processes
- matrix inversion
- policy iteration
- function approximation
- variance reduction
- semi parametric
- optimal policy
- linear regression
- statistical inference
- linear model
- partially observable markov decision processes
- evaluation function
- sufficient conditions