The Mean-Squared Error of Double Q-Learning.

Wentao Weng, Harsh Gupta, Niao He, Lei Ying, R. Srikant

Published in: NeurIPS (2020)

Keyphrases

reinforcement learning
function approximation
multi-agent
cooperative
state space
stochastic approximation
learning algorithm
reinforcement learning algorithms
action selection
learning rate
multi-agent reinforcement learning
optimal policy
reinforcement learning methods
quantization error
model-free
genetic algorithm
temporal difference learning
credit assignment
multiagent learning
real time
distortion measure
prediction error
search algorithm
case study
decision making
artificial intelligence
information retrieval