Offline Policy Evaluation with Out-of-Sample Guarantees.
Sofia Ek
Dave Zachariah
Published in:
CoRR (2023)
Keyphrases
policy evaluation
least squares
Monte Carlo
model-free
reinforcement learning
temporal difference
Markov decision processes
policy iteration
function approximation
variance reduction
matrix inversion
semi-parametric
optimal policy
linear regression
learning algorithm
reinforcement learning algorithms
upper bound