Policy evaluation from a single path: Multi-step methods, mixing and mis-specification.

Yaqi Duan, Martin J. Wainwright
Published in: CoRR (2022)
Keyphrases
  • multi step
  • lower bound
  • training data
  • supervised learning