Publication: Pessimistic Q-Learning for Offline Reinforcement Learning: Towards Optimal Sample Complexity.