Wasserstein Distributionally Robust Policy Evaluation and Learning for Contextual Bandits.
Yi Shen
Pan Xu
Michael M. Zavlanos
Published in:
Trans. Mach. Learn. Res. (2024)
Keyphrases
reinforcement learning
model free
learning algorithm
least squares
decision making
optimal solution
active learning
upper bound