Multi-armed Bandit Requiring Monotone Arm Sequences.

Published in: CoRR (2021)

Keyphrases

multi-armed bandit
multi-armed bandits
hidden Markov models
reinforcement learning
machine learning
upper bound
objective function
lower bound
probabilistic model
mutual information
maximum likelihood
EM algorithm
closed form
decentralized decision making