Experiments with Infinite-Horizon, Policy-Gradient Estimation

Jonathan Baxter, Peter L. Bartlett, Lex Weaver

Published in: J. Artif. Intell. Res. (2001)

Keyphrases