Combining No-regret and Q-learning.

Ian A. Kash, Michael Sullins, Katja Hofmann

Published in: AAMAS (2020)

Keyphrases

reinforcement learning
learning algorithm
cooperative
state space
lower bound
online learning
expert advice
active learning
function approximation
learning rate
reward function
reinforcement learning algorithms
pairwise
control system
dynamic programming
machine learning
optimal policy
combining multiple
action selection