Publication: Understanding Reinforcement Learning Algorithms: The Progress from Basic Q-learning to Proximal Policy Optimization.