PGQ: Combining policy gradient and Q-learning.
Brendan O'Donoghue, Rémi Munos, Koray Kavukcuoglu, Volodymyr Mnih
Published in: CoRR (2016)
Keyphrases
- policy gradient
- reinforcement learning
- function approximation
- actor-critic
- reinforcement learning algorithms
- model-free reinforcement learning
- state action
- gradient method
- state space
- learning algorithm
- policy search
- cooperative
- reinforcement learning methods
- single agent
- approximation methods
- Markov decision processes
- model-free
- learning rate
- optimal control
- function approximators
- variance reduction
- temporal difference learning
- learning tasks