Beyond Value-Function Gaps: Improved Instance-Dependent Regret Bounds for Episodic Reinforcement Learning.
Christoph Dann
Teodor V. Marinov
Mehryar Mohri
Julian Zimmert
Published in: CoRR (2021)
Keyphrases
reinforcement learning
regret bounds
function approximators
function approximation
state space
optimal policy