Off-policy Policy Evaluation For Sequential Decisions Under Unobserved Confounding.
Hongseok Namkoong, Ramtin Keramati, Steve Yadlowsky, Emma Brunskill
Published in: NeurIPS (2020)
Keyphrases
- policy evaluation
- least squares
- reinforcement learning
- temporal difference
- model free
- Monte Carlo
- Markov decision processes
- matrix inversion
- policy iteration
- decision making
- function approximation
- decision makers
- semi parametric
- decision process
- optimal policy
- variance reduction
- decision problems
- regression problems
- statistical inference
- feature selection