Confident Off-Policy Evaluation and Selection through Self-Normalized Importance Weighting.
Ilja Kuzborskij
Claire Vernade
András György
Csaba Szepesvári
Published in:
AISTATS (2021)
Keyphrases
policy evaluation
least squares
Monte Carlo
model free
function approximation
temporal difference
policy iteration
variance reduction
reinforcement learning
Markov chain
fixed point
statistical inference
semi parametric