Evaluating the Robustness of Off-Policy Evaluation
Yuta Saito
Takuma Udagawa
Haruka Kiyohara
Kazuki Mogi
Yusuke Narita
Kei Tateno
Published in:
CoRR (2021)
Keyphrases
policy evaluation
least squares
temporal difference
Monte Carlo
reinforcement learning
model-free
matrix inversion
Markov decision processes
semi-parametric
policy iteration
variance reduction
function approximation
optimal solution
state space
learning tasks