Publication: Reinforcement learning with nonstationary reward depending on the episode.