Unifying the Stochastic and the Adversarial Bandits with Knapsack.

Anshuka Rangi, Massimo Franceschetti, Long Tran-Thanh

Published in: IJCAI (2019)

Keyphrases

stochastic systems
dynamic programming
upper bound
regret bounds
knapsack problem
multi-agent
stochastic optimization
stochastic models
multi-armed bandit
machine learning
optimal solution
search algorithm
special case
multistage
Monte Carlo