On the Sublinear Regret of GP-UCB

Justin Whitehouse, Aaditya Ramdas, Zhiwei Steven Wu

Published in: NeurIPS (2023)

Keyphrases

bandit problems
genetic programming
multi-armed bandit
multi-armed bandit problems
decision problems
fitness function
online learning
reinforcement learning
lower bound
confidence bounds
upper confidence bound
regret bounds
expert advice
evolutionary algorithm
upper bound
loss function
worst case
cost-sensitive
binary classification
expected utility
gradient projection
genetic algorithm