Preference-free Alignment Learning with Regularized Relevance Reward.
Sungdong Kim
Minjoon Seo
Published in:
CoRR (2024)
Keyphrases
reinforcement learning
online learning
learning algorithm
learning process
neural network
machine learning
active learning
least squares
learning systems
learning tasks
inductive inference
preference learning
information retrieval
multi agent
knowledge acquisition
solving problems