Beyond Reverse KL: Generalizing Direct Preference Optimization with Diverse Divergence Constraints.
Chaoqi Wang, Yibo Jiang, Chenghao Yang, Han Liu, Yuxin Chen
Published in: ICLR (2024)
Keyphrases
- constrained optimization
- Kullback-Leibler
- soft constraints
- optimization algorithm
- wide variety
- optimization problems
- global optimization
- KL divergence
- constraint satisfaction
- optimization model
- NP-hard optimization problems
- divergence measure
- hard constraints
- multi-attribute
- optimization method
- multi-criteria
- optimization process
- cross entropy
- user preferences
- decision makers
- real world