Mallows-DPO: Fine-Tune Your LLM with Preference Dispersions.

Haoxian Chen, Hanyang Zhao, Henry Lam, David D. Yao, Wenpin Tang
Published in: CoRR (2024)
Keyphrases
  • fine tune
  • fine tuning
  • user preferences
  • learning algorithm
  • petri net