Asymptotically Optimal Strategies For Combinatorial Semi-Bandits in Polynomial Time.

Thibaut Cuvelier, Richard Combes, Eric Gourdin

Published in: ALT (2021)

Keyphrases

optimal strategy
worst case
decision problems
special case
Monte Carlo
expected cost
sample size
stochastic systems
computational complexity
approximation algorithms
mathematical models
machine learning
cooperative game
bounded treewidth
expected utility
cooperative
Markov chain
finite automata
NP-hardness
e-learning
decision making
neural network
min sum