Unifying the stochastic and the adversarial Bandits with Knapsack.

Anshuka Rangi, Massimo Franceschetti, Long Tran-Thanh

Published in: CoRR (2018)

Keyphrases

stochastic systems
dynamic programming
knapsack problem
monte carlo
learning automata
data sets
regret bounds
stochastic optimization
multi armed bandits
stochastic models
multiple choice
artificial intelligence
confidence intervals
expert systems
markov processes
multiscale
machine learning
multi armed bandit
real time