Combinatorial Bandits with Full-Bandit Feedback: Sample Complexity and Regret Minimization.

Idan Rejwan, Yishay Mansour

Published in: CoRR (2019)

Keyphrases

sample complexity
regret minimization
regret bounds
lower bound
theoretical analysis
upper bound
multi-armed bandit
learning algorithm
PAC learning
learning problems
active learning
special case
game theoretic
supervised learning
VC dimension
generalization error
Nash equilibrium
concept classes
random sampling
multi-armed bandit problems
sample size
training examples
machine learning algorithms
small number
multi-agent learning
game theory
optimal solution
multi-agent
data sets