Mallows-DPO: Fine-Tune Your LLM with Preference Dispersions.
Haoxian Chen
Hanyang Zhao
Henry Lam
David D. Yao
Wenpin Tang
Published in:
CoRR (2024)
Keyphrases
fine tune
fine tuning
user preferences
learning algorithm
petri net