Towards Robust Off-Policy Evaluation via Human Inputs.

Harvineet Singh, Shalmali Joshi, Finale Doshi-Velez, Himabindu Lakkaraju
Published in: CoRR (2022)
Keyphrases
  • policy evaluation
  • least squares
  • reinforcement learning
  • model free
  • temporal difference
  • variance reduction
  • machine learning
  • moving objects
  • matrix inversion