On Well-posedness and Minimax Optimal Rates of Nonparametric Q-function Estimation in Off-policy Evaluation.
Xiaohong Chen
Zhengling Qi
Published in:
ICML (2022)
Keyphrases
policy evaluation
worst case
semiparametric
optimal solution
expected error
statistical inference
dynamic programming
reinforcement learning
function approximation
optimal control
model-free
density estimation
variance reduction
parametric models
temporal difference
policy iteration
Monte Carlo