Non-asymptotic analysis of a new bandit algorithm for semi-bounded rewards.
Junya Honda
Akimichi Takemura
Published in:
J. Mach. Learn. Res. (2015)
Keyphrases
optimal solution
learning algorithm
objective function
cost function
asymptotic analysis
reinforcement learning
computational complexity
NP-hard
search space
sufficient conditions