Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards.
Haoxiang Wang, Yong Lin, Wei Xiong, Rui Yang, Shizhe Diao, Shuang Qiu, Han Zhao, Tong Zhang
Published in: ACL (1) (2024)
Keyphrases
- user preferences
- multi objective
- user behavior
- multi objective optimization
- qualitative preferences
- user profiles
- recommender systems
- optimization algorithm
- collaborative filtering
- control system
- evolutionary algorithm
- user feedback
- user behaviour
- recommendation systems
- reinforcement learning
- multiple objectives
- preference model
- objective function
- recommendation algorithms
- decision making
- user interests
- multi criteria
- particle swarm optimization
- preference models
- genetic algorithm