Publication: Estimation and adaptive control of span-contracting Markov decision processes.