Eliminating Biased Length Reliance of Direct Preference Optimization via Down-Sampled KL Divergence.
Junru Lu
Jiazheng Li
Siyu An
Meng Zhao
Yulan He
Di Yin
Xing Sun
Published in:
CoRR (2024)
Keyphrases
kl divergence
kullback leibler
information theoretic
kullback leibler divergence
mahalanobis distance
gaussian mixture
information theory
distance measure
gaussian distribution
posterior distribution
probabilistic latent semantic analysis