On the Reduction of Variance and Overestimation of Deep Q-Learning.

Mohammed Sabry, Amr M. A. Khalifa

Published in: CoRR (2019)

Keyphrases

cooperative
reinforcement learning
multi-agent
function approximation
state space
learning algorithm
model-free
trade-off
data mining
standard deviation
multi-agent reinforcement learning
stochastic approximation
action selection
prediction error
correlation coefficient
covariance matrix
optimal policy
dynamic programming