On the Pareto Frontier of Regret Minimization and Best Arm Identification in Stochastic Bandits.

Zixin Zhong, Wang Chi Cheung, Vincent Y. F. Tan

Published in: CoRR (2021)

Keyphrases

regret minimization
stochastic systems
Pareto frontier
game theoretic
objective function
multi objective
Markov chain
multi objective optimization
Nash equilibrium
neural network
artificial intelligence
special case