A Minimax Learning Approach to Off-Policy Evaluation in Partially Observable Markov Decision Processes.
Chengchun Shi
Masatoshi Uehara
Nan Jiang
Published in:
CoRR (2021)
Keyphrases
reinforcement learning
machine learning
learning algorithm
Monte Carlo
decision problems
policy evaluation
supervised learning
optimal policy
learning tasks
function approximation
partially observable Markov decision processes
computational complexity