DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning.
Daochen Zha, Jingru Xie, Wenye Ma, Sheng Zhang, Xiangru Lian, Xia Hu, Ji Liu
Published in: ICML (2021)
Keyphrases
- reinforcement learning
- function approximation
- machine learning
- model free
- state space
- reinforcement learning algorithms
- temporal difference
- data sets
- databases
- temporal difference learning
- markov decision processes
- transfer learning
- learning algorithm
- dynamic programming
- real world
- robotic control
- evolutionary algorithm
- search space
- markov decision process
- information retrieval