No Discounted-Regret Learning in Adversarial Bandits with Delays.
Ilai Bistritz
Zhengyuan Zhou
Xi Chen
Nicholas Bambos
Jose H. Blanchet
Published in:
CoRR (2021)
Keyphrases
online learning
reinforcement learning
learning process
learning algorithm
active learning
data sets
multi agent
prior knowledge
supervised learning
knowledge acquisition
unsupervised learning
learning problems
multi armed bandits