A Minimax Learning Approach to Off-Policy Evaluation in Confounded Partially Observable Markov Decision Processes.
Chengchun Shi
Masatoshi Uehara
Jiawei Huang
Nan Jiang
Published in:
ICML (2022)
Keyphrases
reinforcement learning
learning algorithm
learning tasks
policy evaluation
least squares
supervised learning
partially observable
partially observable Markov decision processes
computational complexity
function approximation