Publication: On-Policy Deep Reinforcement Learning for the Average-Reward Criterion.