Exploration-exploitation tradeoff using variance estimates in multi-armed bandits.
Jean-Yves Audibert
Rémi Munos
Csaba Szepesvári
Published in:
Theor. Comput. Sci. (2009)
Keyphrases
multi-armed bandits
exploration-exploitation tradeoff
bandit problems
neural network
feature selection
objective function
computational complexity
state space
knn
upper bound
sufficient conditions
learning tasks
function approximation
multi-armed bandit