Adversarial Multi-Armed Bandit Approach to Stochastic Optimization.

Hyeong Soo Chang, Michael C. Fu, Steven I. Marcus

Published in: CDC (2006)

Keyphrases

stochastic optimization
multi-armed bandit
multi-armed bandits
reinforcement learning
multistage
multi-agent
decentralized decision making
regret bounds
lower bound
dynamic programming
linear programming
missing data
bandit problems