Average-Reward Learning and Planning with Options.

Yi Wan, Abhishek Naik, Richard S. Sutton

Published in: CoRR (2021)

Keyphrases

learning algorithm
average reward
reinforcement learning
random walk
stochastic games
knowledge base
supervised learning
optimal control
decision theoretic
long run
TD learning