Data-Efficient Policy Evaluation Through Behavior Policy Search.
Josiah P. Hanna
Philip S. Thomas
Peter Stone
Scott Niekum
Published in:
CoRR (2017)
Keyphrases
machine learning
training data
probability distribution
random walk
policy evaluation