ConQUR: Mitigating Delusional Bias in Deep Q-Learning.
Dijia Su, Jayden Ooi, Tyler Lu, Dale Schuurmans, Craig Boutilier
Published in: ICML (2020)
Keyphrases
- reinforcement learning
- cooperative
- multi agent
- function approximation
- state space
- learning algorithm
- learning rate
- reinforcement learning algorithms
- multi agent reinforcement learning
- temporal difference learning
- trade off
- variance reduction
- model free
- optimal policy
- bucket brigade
- learning agent
- action selection
- monte carlo
- real time
- least squares
- expert systems
- case study
- genetic algorithm