Publication: Dynamics of Softmax Q-Learning in Two-Player Two-Action Games