Eliminating Biased Length Reliance of Direct Preference Optimization via Down-Sampled KL Divergence.

Junru Lu, Jiazheng Li, Siyu An, Meng Zhao, Yulan He, Di Yin, Xing Sun
Published in: CoRR (2024)
Keyphrases