Striving for Simplicity in Off-policy Deep Reinforcement Learning.
Rishabh Agarwal, Dale Schuurmans, Mohammad Norouzi
Published in: CoRR (2019)
Keyphrases
- reinforcement learning
- function approximation
- temporal difference
- markov decision processes
- learning algorithm
- model free
- real time
- deep learning
- learning process
- state space
- optimal policy
- reinforcement learning methods
- reinforcement learning algorithms
- action selection
- database
- supervised learning
- multi agent
- monte carlo
- probabilistic model
- learning problems
- dynamic programming
- learning classifier systems
- search space
- case study
- social networks
- machine learning
- temporal difference learning
- neural network
- stochastic approximation
- robotic control