On the Estimation Bias in Double Q-Learning.

Zhizhou Ren, Guangxiang Zhu, Hao Hu, Beining Han, Jianglun Chen, Chongjie Zhang

Published in: NeurIPS (2021)

Keyphrases

reinforcement learning
state space
function approximation
multi-agent
cooperative
learning algorithm
accurate estimation
parameter estimation
trade-off
case study
optimal policy
decision making
Monte Carlo simulation
estimation algorithm
reinforcement learning algorithms
neural network
stochastic approximation