Learning Optimal Advantage from Preferences and Mistaking It for Reward.
W. Bradley Knox
Stephane Hatgis-Kessell
Sigurdur O. Adalgeirsson
Serena Booth
Anca D. Dragan
Peter Stone
Scott Niekum
Published in:
AAAI (2024)
Keyphrases
reinforcement learning
learning process
learning algorithm
learning problems
learning systems
dynamic programming
learning scheme
data sets
neural network
e-learning
decision trees
optimal solution
supervised learning
learning capabilities
learning agent