Mastering "Gongzhu" with Self-play Deep Reinforcement Learning

Licheng Wu, Qifei Wu, Hongming Zhong, Xiali Li

Published in: ICCSIP (2022)

Keyphrases

reinforcement learning
function approximation
reinforcement learning algorithms
state space
temporal difference
real time
temporal difference learning
case study
model free
game playing
optimal policy
robotic control
machine learning
Markov decision processes
supervised learning
multi agent
robot control
function approximators
learning algorithm
multi agent reinforcement learning
dynamic programming
search space
deep learning
hands on guide