ConQUR: Mitigating Delusional Bias in Deep Q-learning.
Andy Su, Jayden Ooi, Tyler Lu, Dale Schuurmans, Craig Boutilier
Published in: CoRR (2020)
Keyphrases
- reinforcement learning
- multi agent
- function approximation
- cooperative
- learning algorithm
- state space
- optimal policy
- model free
- stochastic approximation
- action selection
- reinforcement learning algorithms
- knowledge base
- real time
- multi agent reinforcement learning
- single agent
- deep learning
- continuous state and action spaces
- stochastic shortest path