Publication: Deterministic MDPs with Adversarial Rewards and Bandit Feedback.