Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards.
Haoxiang Wang, Yong Lin, Wei Xiong, Rui Yang, Shizhe Diao, Shuang Qiu, Han Zhao, Tong Zhang
Published in: CoRR (2024)
Keyphrases
- user preferences
- multi-objective
- user behavior
- recommender systems
- user profiles
- collaborative filtering
- multi-objective optimization
- evolutionary algorithm
- preference models
- control system
- recommendation systems
- user behaviour
- user-specific
- preference model
- qualitative preferences
- user feedback
- personalized recommendation
- recommendation algorithms
- genetic algorithm
- optimization algorithm
- objective function
- reinforcement learning
- Markov decision processes
- multiple objectives
- multiple criteria
- particle swarm optimization
- decision makers
- pairwise
- similarity measure
- neural network
- hierarchical task networks