Asymptotically Optimal Strategies For Combinatorial Semi-Bandits in Polynomial Time.
Thibaut Cuvelier, Richard Combes, Eric Gourdin
Published in: ALT (2021)
Keyphrases
- optimal strategy
- worst case
- decision problems
- special case
- monte carlo
- expected cost
- sample size
- stochastic systems
- computational complexity
- approximation algorithms
- mathematical models
- machine learning
- cooperative game
- bounded treewidth
- expected utility
- cooperative
- markov chain
- finite automata
- np hardness
- e learning
- decision making
- neural network
- min sum