Off-Policy Evaluation for Large Action Spaces via Conjunct Effect Modeling.
Yuta Saito
Qingyang Ren
Thorsten Joachims
Published in:
CoRR (2023)
Keyphrases
policy evaluation
action space
Markov decision processes
reinforcement learning
least squares
state space
function approximation
policy iteration
model free
temporal difference
machine learning
evaluation function