A structured multiarmed bandit problem and the greedy policy.
Adam J. Mersereau, Paat Rusmevichientong, John N. Tsitsiklis
Published in: CDC (2008)
Keyphrases
- multiarmed bandit
- optimal policy
- search algorithm
- greedy algorithm
- dynamic programming
- structured data
- greedy algorithms
- greedy heuristic
- feature selection
- special case
- genetic algorithm
- active learning
- decision process
- asymptotically optimal
- objective function
- decision making
- structured learning
- policy making
- policy search