Near-Optimal Reinforcement Learning with Self-Play.
Yu Bai, Chi Jin, Tiancheng Yu
Published in: CoRR (2020)
Keyphrases
- reinforcement learning
- function approximation
- learning process
- learning algorithm
- optimal policy
- state space
- provably near optimal
- reinforcement learning algorithms
- temporal difference
- Markov decision processes
- machine learning
- multi-agent
- model-free
- multi-agent reinforcement learning
- dynamic programming
- transfer learning
- multiscale
- real world
- databases
- autonomous learning
- policy search
- robotic control
- real time