Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning.
Philip S. Thomas
Emma Brunskill
Published in: CoRR (2016)
Keyphrases
reinforcement learning
function approximation
temporal difference
policy evaluation
machine learning
learning algorithm
feature selection
training data
least squares