Tightening the Dependence on Horizon in the Sample Complexity of Q-Learning.
Gen Li, Changxiao Cai, Yuxin Chen, Yuantao Gu, Yuting Wei, Yuejie Chi
Published in: CoRR (2021)
Keyphrases
- reinforcement learning
- function approximation
- multi agent
- cooperative
- state space
- action selection
- model free
- reinforcement learning algorithms
- stochastic approximation
- learning algorithm
- potential field
- real time
- optimal policy
- information retrieval
- data mining
- reinforcement learning methods
- temporal difference learning
- dependence structure
- mobile robot
- dynamic programming
- multi agent systems
- case study
- decision making
- genetic algorithm