Publication: Improving Actor-Critic Reinforcement Learning via Hamiltonian Policy.