On effectiveness of the Mirror Descent Algorithm for a stochastic multi-armed bandit governed by a stationary finite Markov chain.

Published in: AuCC (2013)

Keyphrases