Distributionally Robust Policy Evaluation under General Covariate Shift in Contextual Bandits.
Yihong Guo
Hao Liu
Yisong Yue
Anqi Liu
Published in:
CoRR (2024)
Keyphrases
policy evaluation
covariate shift
reinforcement learning
data points
least squares
expectation maximization
model selection
unsupervised learning
linear regression
temporal difference