Lifelong Bandit Optimization: No Prior and No Regret.
Felix Schur, Parnian Kassraie, Jonas Rothfuss, Andreas Krause
Published in: CoRR (2022)
Keyphrases
- bandit problems
- optimization algorithm
- global optimization
- optimization process
- optimization problems
- online learning
- loss function
- utility elicitation
- regret bounds
- random sampling
- optimization method
- neural network
- multi-objective
- prior knowledge
- support vector
- upper confidence bound
- competence development
- multi-armed bandit problems
- lifelong learning
- optimization model
- combinatorial optimization
- resource allocation
- worst case
- least squares