Chained Information-Theoretic bounds and Tight Regret Rate for Linear Bandit Problems.
Amaury Gouverneur, Borja Rodríguez Gálvez, Tobias J. Oechtering, Mikael Skoglund. Published in: CoRR (2024)
Keyphrases
- information-theoretic
- bandit problems
- lower bound
- upper bound
- mutual information
- information theory
- worst case
- decision problems
- multi-armed bandits
- multi-armed bandit problems
- theoretic framework
- regret bounds
- Jensen-Shannon divergence
- information bottleneck
- information-theoretic measures
- Bregman divergences
- relative entropy
- log-likelihood
- minimum description length
- entropy measure
- expected utility
- KL divergence
- decision making
- Kullback-Leibler divergence
- image registration
- NP-hard
- optimal solution
- Jensen-Shannon
- closed form
- dynamic programming
- multi-objective
- objective function
- similarity measure