Asymptotically Optimal Strategies For Combinatorial Semi-Bandits in Polynomial Time.
Thibaut Cuvelier, Richard Combes, Eric Gourdin
Published in: CoRR (2021)
Keyphrases
- optimal strategy
- decision problems
- worst case
- Monte Carlo
- special case
- computational complexity
- expected cost
- approximation algorithms
- mathematical models
- expected utility
- asymptotically optimal
- stochastic systems
- influence diagrams
- bounded treewidth
- multi-armed bandit
- central limit theorem
- data sets
- utility function
- experimental data
- sample size
- finite automata