Combinatorial Q-Learning for Dou Di Zhu.

Yang You, Liangwei Li, Baisong Guo, Weiming Wang, Cewu Lu

Published in: AIIDE (2020)

Keyphrases

reinforcement learning
multi-agent
cooperative
function approximation
state space
stochastic approximation
learning algorithm
action selection
optimal policy
learning rate
bucket brigade
ieee trans
reinforcement learning algorithms
state-action
temporal-difference learning
multi-agent reinforcement learning
reinforcement learning methods
database
real world
databases