Distributionally Robust Policy Evaluation under General Covariate Shift in Contextual Bandits.

Yihong Guo, Hao Liu, Yisong Yue, Anqi Liu
Published in: CoRR (2024)
Keyphrases
  • policy evaluation
  • covariate shift
  • reinforcement learning
  • data points
  • least squares
  • expectation maximization
  • model selection
  • unsupervised learning
  • linear regression
  • temporal difference