Regularized Anderson Acceleration for Off-Policy Deep Reinforcement Learning.
Wenjie Shi, Shiji Song, Hui Wu, Ya-Chu Hsu, Cheng Wu, Gao Huang
Published in: NeurIPS (2019)
Keyphrases
- reinforcement learning
- function approximation
- state space
- multi agent reinforcement learning
- reinforcement learning algorithms
- robotic control
- reinforcement learning methods
- least squares
- dynamic programming
- learning process
- total least squares
- neural network
- risk minimization
- temporal difference learning
- learning agents
- deep learning
- control problems
- temporal difference
- model free
- machine learning
- optimal policy
- objective function
- data mining
- learning agent
- real world
- markov decision processes
- real time