Approximate Indexability and Bandit Problems with Concave Rewards and Delayed Feedback.

Sudipto Guha, Kamesh Munagala
Published in: APPROX-RANDOM (2013)
Keyphrases
  • bandit problems
  • delayed feedback
  • multi-armed bandits
  • decision problems
  • piecewise linear
  • exploration exploitation
  • objective function
  • decentralized decision making
  • learning algorithm
  • lower bound
  • genetic algorithm