Finding Optimal Arms in Non-stochastic Combinatorial Bandits with Semi-bandit Feedback and Finite Budget.
Jasmin Brandt, Björn Haddenhorst, Viktor Bengs, Eyke Hüllermeier
Published in: CoRR (2022)
Keyphrases
- finding optimal
- multi armed bandits
- multi armed bandit
- multi armed bandit problems
- bandit problems
- regret bounds
- reinforcement learning
- stochastic systems
- optimal or near optimal
- discrete random variables
- decision problems
- multi dimensional
- relevance feedback
- index structure
- monte carlo
- game tree
- dynamic programming
- lower bound