Optimality of Thompson Sampling with Noninformative Priors for Pareto Bandits.
Jongyeong Lee, Junya Honda, Chao-Kai Chiang, Masashi Sugiyama
Published in: CoRR (2023)
Keyphrases
- average reward
- multi objective
- multi armed bandit
- random sampling
- monte carlo
- sample size
- genetic algorithm
- prior information
- multi objective optimization
- stochastic systems
- sampling algorithm
- multiobjective optimization
- parameter space
- bayesian framework
- sampling methods
- pareto optimal
- machine learning
- pareto optimality
- sampling strategies
- prior probabilities
- multiple objectives
- neural network
- differential evolution
- bayesian networks