Logarithmic Regret for Matrix Games against an Adversary with Noisy Bandit Feedback.
Arnab Maiti, Kevin G. Jamieson, Lillian J. Ratliff
Published in: CoRR (2023)
Keyphrases
- regret bounds
- bandit problems
- weighted majority
- lower bound
- linear regression
- online learning
- expert advice
- worst case
- game theory
- multi armed bandit
- computer games
- upper bound
- stackelberg game
- upper confidence bound
- relevance feedback
- nash equilibrium
- least squares
- game design
- user feedback
- game theoretic
- game playing
- online convex optimization
- regret minimization
- video games
- noisy measurements
- multi armed bandit problems
- game play
- educational games
- low rank
- singular value decomposition
- bregman divergences
- nash equilibria
- noisy data
- covariance matrix
- loss function