Recurrent policy gradients.

Daan Wierstra, Alexander Förster, Jan Peters, Jürgen Schmidhuber

Published in: Log. J. IGPL (2010)

Keyphrases

optimal policy
real world
asymptotically optimal
real time
information retrieval
artificial intelligence
information systems
knowledge base
steady state
feed forward
policy making