WPO: Enhancing RLHF with Weighted Preference Optimization.
Wenxuan Zhou, Ravi Agrawal, Shujian Zhang, Sathish Reddy Indurthi, Sanqiang Zhao, Kaiqiang Song, Silei Xu, Chenguang Zhu
Published in: CoRR (2024)
Keyphrases
- optimization problems
- evolutionary algorithm
- optimization algorithm
- artificial intelligence
- evolution strategy
- multiscale
- global optimization
- optimization methods
- optimization process
- discrete optimization
- information systems
- search algorithm
- simulated annealing
- user preferences
- optimization method
- optimization strategies