On Convergence of Average-Reward Off-Policy Control Algorithms in Weakly-Communicating MDPs.

Yi Wan, Richard S. Sutton
Published in: CoRR (2022)
Keyphrases