Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning.
Philip S. Thomas
Emma Brunskill
Published in:
ICML (2016)
Keyphrases
reinforcement learning
temporal difference
machine learning
training data
Bayesian networks
least squares
supervised learning
high dimensional data
function approximation
policy evaluation