Mean-based Best Arm Identification in Stochastic Bandits under Reward Contamination.

Arpan Mukherjee, Ali Tajer, Pin-Yu Chen, Payel Das

Published in: CoRR (2021)

Keyphrases

multi-armed bandit
multi-armed bandit problems
stochastic systems
bandit problems
reinforcement learning
automatic identification
Monte Carlo
database
stochastic optimization
stochastic programming
case study
stochastic processes
regret bounds
multi-armed bandits