Regret Minimization Experience Replay in Off-Policy Reinforcement Learning.
Xu-Hui Liu, Zhenghai Xue, Jing-Cheng Pang, Shengyi Jiang, Feng Xu, Yang Yu
Published in: NeurIPS (2021)
Keyphrases
- regret minimization
- reinforcement learning
- Nash equilibrium
- function approximation
- state space
- Markov decision processes
- game-theoretic
- robotic control
- multi-agent
- reinforcement learning algorithms
- neural network
- game theory
- supervised learning
- computational complexity
- model-free
- action selection
- action space
- learning algorithm