Δ-OPE: Off-Policy Estimation with Pairs of Policies.

Olivier Jeunen, Aleksei Ustimenko
Published in: CoRR (2024)
Keyphrases
  • search algorithm
  • estimation algorithm
  • artificial intelligence
  • pairwise
  • optimal policy
  • learning algorithm
  • accurate estimation