Minimax Policies for Adversarial and Stochastic Bandits.

Jean-Yves Audibert, Sébastien Bubeck

Published in: COLT (2009)

Keyphrases

stochastic systems
control policies
minimax search
Monte Carlo
optimal policy
stochastic inventory control
multi-armed bandit problems
evaluation function
stochastic model
stochastic optimization
stochastic models
echelon stock
multi-agent
multi-armed bandits
asymptotic properties
multi-armed bandit
regret bounds
state dependent
base stock policies
learning automata