Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment.
Rui Yang, Xiaoman Pan, Feng Luo, Shuang Qiu, Han Zhong, Dong Yu, Jianshu Chen
Published in: CoRR (2024)
Keyphrases
- multi objective
- evolutionary algorithm
- parameter estimation
- image alignment
- contextual information
- learning algorithm
- multi objective optimization
- context sensitive
- statistical model
- dynamic environments
- probabilistic model
- trade off
- reinforcement learning
- context aware
- particle swarm optimization
- probability distribution
- prior knowledge
- statistical models
- multiple criteria
- genetic algorithm