WD3: Taming the Estimation Bias in Deep Reinforcement Learning.
Qiang He, Xinwen Hou
Published in: ICTAI (2020)
Keyphrases
- reinforcement learning
- learning algorithm
- function approximation
- estimation accuracy
- density estimation
- reinforcement learning algorithms
- temporal difference
- machine learning
- neural network
- trade off
- learning process
- estimation algorithm
- maximum likelihood estimation
- model free
- robotic control
- accurate estimation
- monte carlo simulation
- optimal control
- objective function
- case study