Hindsight Preference Learning for Offline Preference-based Reinforcement Learning.
Chen-Xiao Gao, Shengjun Fang, Chenjun Xiao, Yang Yu, Zongzhang Zhang
Published in: CoRR (2024)
Keyphrases
- preference learning
- reinforcement learning
- ordinal regression
- Gaussian processes
- pairwise comparison
- active learning
- recommender systems
- ranking functions
- preference relations
- user preferences
- learning algorithm
- learning process
- pairwise
- machine learning
- Gaussian process
- supervised learning
- state space
- closed form
- multi-objective
- multi-agent
- similarity measure