Thompson Sampling for Combinatorial Multi-armed Bandit with Probabilistically Triggered Arms.

Alihan Hüyük, Cem Tekin

Published in: CoRR (2018)

Keyphrases

multi-armed bandit
multi-armed bandits
reinforcement learning
decentralized decision making
regret bounds
machine learning
learning algorithm
similarity measure
multi-agent
upper bound
statistical models