Non-asymptotic analysis of a new bandit algorithm for semi-bounded rewards.

Junya Honda, Akimichi Takemura

Published in: J. Mach. Learn. Res. (2015)

Keyphrases

optimal solution
learning algorithm
objective function
cost function
asymptotic analysis
reinforcement learning
computational complexity
NP-hard
search space
sufficient conditions