Control Variates for Slate Off-Policy Evaluation.
Nikos Vlassis
Ashok Chandrashekar
Fernando Amat Gil
Nathan Kallus
Published in:
NeurIPS (2021)
Keyphrases
policy evaluation
control system
least squares
temporal difference
model free
monte carlo
variance reduction
reinforcement learning
artificial neural networks
optimal control
semi parametric