Publication: Fast Stochastic Policy Gradient: Negative Momentum for Reinforcement Learning.