Learning Near-Optimal Policies with Bellman-Residual Minimization Based Fitted Policy Iteration and a Single Sample Path.

András Antos, Csaba Szepesvári, Rémi Munos
Published in: COLT (2006)
Keyphrases