DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization.
Aviral Kumar, Rishabh Agarwal, Tengyu Ma, Aaron C. Courville, George Tucker, Sergey Levine
Published in: ICLR (2022)
Keyphrases
- reinforcement learning
- learning algorithm
- temporal difference
- function approximation
- reinforcement learning algorithms
- regularization method
- data sets
- machine learning
- learning process
- learning problems
- learning classifier systems
- data dependent
- model free
- parameter selection
- temporal difference learning
- regularization methods