Combinatorial Bandits with Full-Bandit Feedback: Sample Complexity and Regret Minimization.
Idan Rejwan, Yishay Mansour
Published in: CoRR (2019)
Keyphrases
- sample complexity
- regret minimization
- regret bounds
- lower bound
- theoretical analysis
- upper bound
- multi-armed bandit
- learning algorithm
- PAC learning
- learning problems
- active learning
- special case
- game theoretic
- supervised learning
- VC dimension
- generalization error
- Nash equilibrium
- concept classes
- random sampling
- multi-armed bandit problems
- sample size
- training examples
- machine learning algorithms
- small number
- multi-agent learning
- game theory
- optimal solution
- multi-agent
- data sets