A minimax and asymptotically optimal algorithm for stochastic bandits.
Pierre Ménard
Aurélien Garivier
Published in:
ALT (2017)
Keyphrases
asymptotically optimal
dynamic programming
computational complexity
worst case
Monte Carlo
learning algorithm
objective function
search space
simulated annealing
machine learning
optimal solution
lower bound
NP-hard
evaluation function
lot sizing
game tree