Scalable Representation Learning in Linear Contextual Bandits with Constant Regret Guarantees.
Andrea Tirinzoni, Matteo Papini, Ahmed Touati, Alessandro Lazaric, Matteo Pirotta
Published in: CoRR (2022)
Keyphrases
- learning algorithm
- reinforcement learning
- multi-armed bandits
- online learning
- learning process
- supervised learning
- learning systems
- unsupervised learning
- learning problems
- machine learning
- objective function
- support vector
- lower bound
- training data
- e-learning
- knowledge acquisition
- mobile learning
- learning analytics
- visual representation
- multiple representations