Direct Preference Optimization: Your Language Model is Secretly a Reward Model.
Rafael Rafailov
Archit Sharma
Eric Mitchell
Stefano Ermon
Christopher D. Manning
Chelsea Finn
Published in:
CoRR (2023)
Keyphrases
language model
probabilistic model
statistical model
search engine
probability distribution
web search
language modelling