Some algorithms for correlated bandits with non-stationary rewards: Regret bounds and applications.
Prathamesh Mayekar
Nandyala Hemachandra
Published in:
CODS (2016)
Keyphrases
non-stationary
adaptive algorithms
regret bounds
multi-armed bandit
learning algorithm
multi-armed bandits
computational complexity
reinforcement learning
worst case
mutual information
Markov decision processes
bandit problems