Publication: Variance-Aware Regret Bounds for Undiscounted Reinforcement Learning in MDPs.