Gambler Bandits and the Regret of Being Ruined.
Filipo Studzinski Perotto, Sattar Vakili, Pratik Gajane, Yaser Faghan, Mathieu Bourgais
Published in: AAMAS (2021)
Keyphrases
- regret bounds
- multi-armed bandit problems
- multi-armed bandits
- lower bound
- bandit problems
- online learning
- linear regression
- upper bound
- expert advice
- confidence bounds
- stochastic systems
- reinforcement learning
- regret minimization
- loss function
- neural network
- computational complexity
- minimax regret
- decision problems