Optimality of myopic policy for a class of monotone affine restless multi-armed bandits.
Parisa Mansourifard
Tara Javidi
Bhaskar Krishnamachari
Published in:
CDC (2012)
Keyphrases
multi-armed bandits
dynamic programming
infinite horizon
decision making
optimal control
optimal policy
finite number
average reward