Average-Reward Learning and Planning with Options.
Yi Wan
Abhishek Naik
Richard S. Sutton
Published in:
NeurIPS (2021)
Keyphrases
reinforcement learning
learning algorithm
average reward
domain independent
Markov decision processes
supervised learning
decision theoretic
optimality criterion