Finite Sample Analysis of Minimax Offline Reinforcement Learning: Completeness, Fast Rates and First-Order Efficiency.
Masatoshi Uehara
Masaaki Imaizumi
Nan Jiang
Nathan Kallus
Wen Sun
Tengyang Xie
Published in:
CoRR (2021)
Keyphrases
reinforcement learning
finite sample
data sets
learning algorithm
decision trees
objective function
distance measure
error bounds