Confident Off-Policy Evaluation and Selection through Self-Normalized Importance Weighting.
Ilja Kuzborskij
Claire Vernade
András György
Csaba Szepesvári
Published in:
CoRR (2020)
Keyphrases
policy evaluation
least squares
temporal difference
Monte Carlo
model free
function approximation
matrix inversion
machine learning
reinforcement learning
mixture model
Markov decision processes
statistical inference
semi parametric