Adaptive Estimator Selection for Off-Policy Evaluation.
Yi Su
Pavithra Srinath
Akshay Krishnamurthy
Published in:
ICML (2020)
Keyphrases
policy evaluation
least squares
variance reduction
model free
reinforcement learning
temporal difference
policy iteration
Markov decision processes
machine learning
Monte Carlo
linear regression
semi parametric
graphical models
maximum likelihood
function approximation