Beyond Value-Function Gaps: Improved Instance-Dependent Regret Bounds for Episodic Reinforcement Learning.

Published in: NeurIPS (2021)

Keyphrases