DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning.
Daochen Zha, Jingru Xie, Wenye Ma, Sheng Zhang, Xiangru Lian, Xia Hu, Ji Liu
Published in: CoRR (2021)
Keyphrases
- reinforcement learning
- function approximation
- Markov decision processes
- optimal policy
- multi-agent
- robotic control
- reinforcement learning algorithms
- state space
- learning algorithm
- temporal difference learning
- game playing
- search algorithm
- decision trees
- supervised learning
- dynamic programming
- transfer learning
- artificial neural networks
- database systems
- model-free
- decision making
- database