Optimality of myopic policy for a class of monotone affine restless multi-armed bandits.

Parisa Mansourifard, Tara Javidi, Bhaskar Krishnamachari
Published in: CDC (2012)
Keyphrases
  • multi armed bandits
  • dynamic programming
  • infinite horizon
  • decision making
  • optimal control
  • optimal policy
  • finite number
  • average reward