Linear Upper Confidence Bound Algorithm for Contextual Bandit Problem with Piled Rewards.
Kuan-Hao Huang
Hsuan-Tien Lin
Published in:
PAKDD (2) (2016)
Keyphrases
upper confidence bound
contextual bandit
learning algorithm
computational complexity
reinforcement learning
dynamic programming
objective function
optimal solution
k-means
machine learning
information retrieval