Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts.
Haoxiang Wang, Wei Xiong, Tengyang Xie, Han Zhao, Tong Zhang
Published in: CoRR (2024)
Keyphrases
- multi objective
- evolutionary algorithm
- multiple objectives
- multi objective optimization
- optimization algorithm
- particle swarm optimization
- genetic algorithm
- user preferences
- decision making
- evolutionary optimization
- mixture model
- reinforcement learning
- machine learning
- Pareto optimal
- multiobjective optimization
- bi objective
- human experts
- multiple criteria