Interactively Learning the User's Utility for Best-Arm Identification in Multi-Objective Multi-Armed Bandits.
Mathieu Reymond
Eugenio Bargiacchi
Diederik M. Roijers
Ann Nowé
Published in:
AAMAS (2024)
Keyphrases
multi armed bandits
multi objective
learning algorithm
learning process
learning tasks
optimization algorithm
reinforcement learning
evolutionary algorithm
supervised learning
support vector
active learning
special case
least squares
online learning
dynamical systems