BanditQ - No-Regret Learning with Guaranteed Per-User Rewards in Adversarial Environments.

Published in: CoRR (2023)

Keyphrases

online learning
reinforcement learning
learning process
learning algorithm
distributed learning
learning systems
bandit problems
user interface
prior knowledge
learning problems
knowledge acquisition
autonomous robots
binary classification
dynamic environments
user experience
collaborative filtering
supervised learning
mobile devices
multi-armed bandits