PGQ: Combining policy gradient and Q-learning.

Brendan O'Donoghue, Rémi Munos, Koray Kavukcuoglu, Volodymyr Mnih

Published in: CoRR (2016)

Keyphrases

policy gradient
reinforcement learning
function approximation
actor-critic
reinforcement learning algorithms
model-free reinforcement learning
state action
gradient method
state space
learning algorithm
policy search
cooperative
reinforcement learning methods
single agent
approximation methods
Markov decision processes
model-free
learning rate
optimal control
function approximators
variance reduction
temporal difference learning
learning tasks