WD3: Taming the Estimation Bias in Deep Reinforcement Learning.

Qiang He, Xinwen Hou

Published in: ICTAI (2020)

Keyphrases

reinforcement learning
learning algorithm
function approximation
estimation accuracy
density estimation
reinforcement learning algorithms
temporal difference
machine learning
neural network
trade-off
learning process
estimation algorithm
maximum likelihood estimation
model free
robotic control
accurate estimation
Monte Carlo simulation
optimal control
objective function
case study