Old Dog Learns New Tricks: Randomized UCB for Bandit Problems.
Sharan Vaswani
Abbas Mehrabian
Audrey Durand
Branislav Kveton
Published in:
AISTATS (2020)
Keyphrases
bandit problems
decision problems
exploration-exploitation
multi-armed bandits
learning algorithm
decision making
cooperative
dynamic programming
upper bound
decentralized decision making