Linear Upper Confidence Bound Algorithm for Contextual Bandit Problem with Piled Rewards.

Kuan-Hao Huang, Hsuan-Tien Lin
Published in: PAKDD (2) (2016)
Keyphrases
  • upper confidence bound
  • contextual bandit
  • learning algorithm
  • computational complexity
  • reinforcement learning
  • dynamic programming
  • objective function
  • optimal solution
  • k-means
  • machine learning
  • information retrieval