Strategies for Safe Multi-Armed Bandits with Logarithmic Regret and Risk.
Tianrui Chen
Aditya Gangrade
Venkatesh Saligrama
Published in:
CoRR (2022)
Keyphrases
multi armed bandits
bandit problems
multi armed bandit problems
multi armed bandit
regret bounds
worst case
decision problems
machine learning
decision making
online learning
expected utility
loss function
information theoretic
learning theory
optimal strategy