Near-Optimal Reinforcement Learning with Self-Play.
Yu Bai, Chi Jin, Tiancheng Yu
Published in: NeurIPS (2020)
Keyphrases
- reinforcement learning
- function approximation
- learning algorithm
- state space
- model free
- robotic control
- temporal difference learning
- search engine
- database
- optimal policy
- artificial neural networks
- function approximators
- reinforcement learning algorithms
- supervised learning
- multi agent
- decision making
- machine learning
- databases
- data sets