Tightening the Dependence on Horizon in the Sample Complexity of Q-Learning.

Gen Li, Changxiao Cai, Yuxin Chen, Yuantao Gu, Yuting Wei, Yuejie Chi

Published in: ICML (2021)

Keyphrases

reinforcement learning
cooperative
function approximation
multi agent
learning algorithm
state space
stochastic approximation
model free
bucket brigade
action selection
reinforcement learning algorithms
multi agent reinforcement learning
monte carlo
optimal policy
temporal difference learning
information retrieval
potential field
neural network