Characterizing the Exact Behaviors of Temporal Difference Learning Algorithms Using Markov Jump Linear System Theory.

Bin Hu, Usman Ahmed Syed

Published in: NeurIPS (2019)

Keyphrases

temporal difference learning algorithms
function approximation
Markov chain
approximation error
temporal difference learning
reinforcement learning
asymptotic properties
temporal difference