ConQUR: Mitigating Delusional Bias in Deep Q-Learning.

Dijia Su, Jayden Ooi, Tyler Lu, Dale Schuurmans, Craig Boutilier

Published in: ICML (2020)

Keyphrases

reinforcement learning
cooperative
multi-agent
function approximation
state space
learning algorithm
learning rate
reinforcement learning algorithms
multi-agent reinforcement learning
temporal difference learning
trade-off
variance reduction
model-free
optimal policy
bucket brigade
learning agent
action selection
Monte Carlo
real time
least squares
expert systems
case study
genetic algorithm