Minimax Weight and Q-Function Learning for Off-Policy Evaluation.
Masatoshi Uehara
Jiawei Huang
Nan Jiang
Published in: ICML (2020)
Keyphrases
learning algorithm
reinforcement learning
supervised learning
TD learning
artificial neural networks
active learning
least squares
utility function
learning tasks
evaluation function
temporal difference