Publication: Reinforcement Learning Upside Down: Don't Predict Rewards - Just Map Them to Actions.