Doubly-Robust Off-Policy Evaluation with Estimated Logging Policy.
Kyungbok Lee
Myunghee Cho Paik
Published in: CoRR (2024)
Keyphrases
policy evaluation
least squares
reinforcement learning
Monte Carlo
Markov decision processes
policy iteration
optimal policy
model free
function approximation
temporal difference
variance reduction
dynamic programming
neural network
cost function
supervised learning
model selection
semi parametric