Fast Learning in an Actor-Critic Architecture with Reward and Punishment.

Christian Balkenius, Stefan Winberg

Published in: SCAI (2008)

Keyphrases

reinforcement learning
actor critic
policy gradient
learning process
neural network
active learning
temporal difference
learning capabilities
temporal difference learning