Reinforcement Learning through Asynchronous Advantage Actor-Critic on a GPU.
Mohammad Babaeizadeh, Iuri Frosio, Stephen Tyree, Jason Clemons, Jan Kautz
Published in: ICLR (Poster) (2017)
Keyphrases
- actor critic
- reinforcement learning
- policy gradient
- reinforcement learning algorithms
- temporal difference
- approximate dynamic programming
- optimal control
- function approximation
- neuro fuzzy
- gradient method
- policy iteration
- control problems
- optimal policy
- model free
- markov decision processes
- dynamic programming
- multi agent
- natural actor critic
- real time
- rl algorithms
- recursive least squares
- policy gradient methods
- stochastic games
- average reward
- partially observable markov decision processes
- learning problems
- linear program
- supervised learning
- learning algorithm
- machine learning