Double Reinforcement Learning for Efficient and Robust Off-Policy Evaluation.
Nathan Kallus
Masatoshi Uehara
Published in:
ICML (2020)
Keyphrases
reinforcement learning
policy evaluation
temporal difference
least squares
function approximation
model free
monte carlo
semi parametric
optimal policy
td learning
neural network
markov decision processes
multi agent
policy iteration
markov chain
reinforcement learning methods
computational complexity