Δ-OPE: Off-Policy Estimation with Pairs of Policies.
Olivier Jeunen
Aleksei Ustimenko
Published in:
CoRR (2024)
Keyphrases
search algorithm
estimation algorithm
artificial intelligence
pairwise
optimal policy
learning algorithm
accurate estimation