A structured multiarmed bandit problem and the greedy policy.

Adam J. Mersereau, Paat Rusmevichientong, John N. Tsitsiklis

Published in: CDC (2008)

Keyphrases

multiarmed bandit
optimal policy
search algorithm
greedy algorithm
dynamic programming
structured data
greedy algorithms
greedy heuristic
feature selection
special case
genetic algorithm
active learning
decision process
asymptotically optimal
objective function
decision making
structured learning
policy making
policy search