Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path.

András Antos, Csaba Szepesvári, Rémi Munos
Published in: Mach. Learn. (2008)
Keyphrases