Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning.
Philip S. Thomas
Emma Brunskill
Published in:
CoRR (2016)
Keyphrases
reinforcement learning
function approximation
temporal difference
policy evaluation
machine learning
learning algorithm
feature selection
training data
least squares