Diff-DAC: Distributed Actor-Critic for Multitask Deep Reinforcement Learning.
Sergio Valcarcel Macua, Aleksi Tukiainen, Daniel García-Ocaña Hernández, David Baldazo, Enrique Munoz de Cote, Santiago Zazo
Published in: CoRR (2017)
Keyphrases
- actor critic
- reinforcement learning
- multi task
- policy gradient
- optimal control
- temporal difference
- reinforcement learning algorithms
- function approximation
- approximate dynamic programming
- transfer learning
- multitask learning
- multi agent
- learning problems
- learning tasks
- neuro fuzzy
- gradient method
- policy iteration
- model free
- multi class
- learning algorithm
- evaluation function
- state space
- policy gradient methods
- markov decision processes
- dynamic programming
- learning process
- average reward
- partially observable
- variance reduction
- temporal difference learning
- learning experience
- decision trees