Mastering "Gongzhu" with Self-play Deep Reinforcement Learning.
Licheng Wu, Qifei Wu, Hongming Zhong, Xiali Li
Published in: ICCSIP (2022)
Keyphrases
- reinforcement learning
- function approximation
- reinforcement learning algorithms
- state space
- temporal difference
- real time
- temporal difference learning
- case study
- model free
- game playing
- optimal policy
- robotic control
- machine learning
- Markov decision processes
- supervised learning
- multi agent
- robot control
- function approximators
- learning algorithm
- multi agent reinforcement learning
- dynamic programming
- search space
- deep learning
- hands-on guide