Tightening the Dependence on Horizon in the Sample Complexity of Q-Learning.

Gen Li, Changxiao Cai, Yuxin Chen, Yuantao Gu, Yuting Wei, Yuejie Chi

Published in: CoRR (2021)

Keyphrases

reinforcement learning
function approximation
multi-agent
cooperative
state space
action selection
model-free
reinforcement learning algorithms
stochastic approximation
learning algorithm
potential field
real-time
optimal policy
information retrieval
data mining
reinforcement learning methods
temporal difference learning
dependence structure
mobile robot
dynamic programming
multi-agent systems
case study
decision making
genetic algorithm