A Note on Bounding Regret of the C$^2$UCB Contextual Combinatorial Bandit.
Bastian Oetomo, Malinga Perera, Renata Borovica-Gajic, Benjamin I. P. Rubinstein
Published in: CoRR (2019)
Keyphrases
- bandit problems
- multi armed bandit
- decision problems
- contextual information
- multi armed bandit problems
- upper bound
- regret bounds
- upper confidence bound
- context sensitive
- reinforcement learning
- context aware
- real time
- context dependent
- expected utility
- real world
- lower bound
- online learning
- multi agent
- information retrieval