Publication: Q-Learning and Enhanced Policy Iteration in Discounted Dynamic Programming.