Combining No-regret and Q-learning.

Ian A. Kash, Michael Sullins, Katja Hofmann

Published in: CoRR (2019)

Keyphrases

reinforcement learning
cooperative
online learning
multi-agent
lower bound
decision making
state space
learning algorithm
multi-class
least squares
function approximation
learning rate
binary classification
potential field
bucket brigade