Order-Optimal Instance-Dependent Bounds for Offline Reinforcement Learning with Preference Feedback.
Zhirui Chen
Vincent Y. F. Tan
Published in:
CoRR (2024)
Keyphrases
reinforcement learning
real time
dynamic programming
machine learning
worst case
Markov decision processes
optimal control
soft constraints
tight bounds