ConQUR: Mitigating Delusional Bias in Deep Q-learning.

Andy Su, Jayden Ooi, Tyler Lu, Dale Schuurmans, Craig Boutilier

Published in: CoRR (2020)

Keyphrases

reinforcement learning
multi agent
function approximation
cooperative
learning algorithm
state space
optimal policy
model free
stochastic approximation
action selection
reinforcement learning algorithms
knowledge base
real time
multi agent reinforcement learning
single agent
deep learning
continuous state and action spaces
stochastic shortest path