Combinatorial Q-Learning for Dou Di Zhu.
Yang You, Liangwei Li, Baisong Guo, Weiming Wang, Cewu Lu
Published in: AIIDE (2020)
Keyphrases
- reinforcement learning
- multi agent
- cooperative
- function approximation
- state space
- stochastic approximation
- learning algorithm
- action selection
- optimal policy
- learning rate
- bucket brigade
- ieee trans
- reinforcement learning algorithms
- state action
- temporal difference learning
- multi agent reinforcement learning
- reinforcement learning methods
- database
- real world
- databases