Doubly Robust Off-Policy Value and Gradient Estimation for Deterministic Policies.
Nathan Kallus
Masatoshi Uehara
Published in:
NeurIPS (2020)
Keyphrases
gradient estimation
variance reduction