Publication: Upper Confidence Primal-Dual Reinforcement Learning for CMDP with Adversarial Loss.