Finite-Sample Regret Bound for Distributionally Robust Offline Tabular Reinforcement Learning.
Zhengqing Zhou
Qinxun Bai
Zhengyuan Zhou
Linhai Qiu
Jose H. Blanchet
Peter W. Glynn
Published in:
AISTATS (2021)
Keyphrases
finite sample
reinforcement learning
sample size
statistical learning theory
uniform convergence
nearest neighbor
online learning
mixture model
regret bounds