Tightening the Dependence on Horizon in the Sample Complexity of Q-Learning.
Gen Li, Changxiao Cai, Yuxin Chen, Yuantao Gu, Yuting Wei, Yuejie Chi
Published in: ICML (2021)
Keyphrases
- reinforcement learning
- cooperative
- function approximation
- multi-agent
- learning algorithm
- state space
- stochastic approximation
- model-free
- bucket brigade
- action selection
- reinforcement learning algorithms
- multi-agent reinforcement learning
- Monte Carlo
- optimal policy
- temporal difference learning
- information retrieval
- potential field
- neural network