Inferring Lexicographically-Ordered Rewards from Preferences.
Alihan Hüyük
William R. Zame
Mihaela van der Schaar
Published in:
CoRR (2022)
Keyphrases
reinforcement learning
user preferences
multi attribute
soft constraints
decision making
preference elicitation
individual preferences
data sets
case study
logic programs
constraint satisfaction problems
attribute values
preference relations
multiarmed bandit
multi armed bandits