Regularized Anderson Acceleration for Off-Policy Deep Reinforcement Learning.
Wenjie Shi, Shiji Song, Hui Wu, Ya-Chu Hsu, Cheng Wu, Gao Huang
Published in: CoRR (2019)
Keyphrases
- reinforcement learning
- state space
- reinforcement learning algorithms
- least squares
- multi agent
- function approximation
- model free
- learning algorithm
- multi agent reinforcement learning
- optimal policy
- machine learning
- supervised learning
- Markov decision processes
- robotic control
- temporal difference learning
- control problems
- learning process
- action selection
- temporal difference
- data sets
- transfer learning
- function approximators
- dynamic programming
- policy search
- relational reinforcement learning
- data mining