Publication: Near Optimal Provable Uniform Convergence in Off-Policy Evaluation for Reinforcement Learning.