Balanced Off-Policy Evaluation in General Action Spaces.
Arjun Sondhi
David Arbour
Drew Dimmery
Published in:
CoRR (2019)
Keyphrases
action space
reinforcement learning
Markov decision processes
policy evaluation
state space
temporal difference
machine learning
search algorithm
Monte Carlo
real valued