LIRE: listwise reward enhancement for preference alignment.
Mingye Zhu, Yi Liu, Lei Zhang, Junbo Guo, Zhendong Mao
Published in: CoRR (2024)
Keyphrases
- learning to rank
- pairwise
- loss function
- reinforcement learning
- balancing exploration and exploitation
- ranking functions
- ranking algorithm
- multiple imputation
- evaluation measures
- learning to rank algorithms
- missing data
- query dependent
- document retrieval
- image processing
- user preferences
- information retrieval
- statistical databases
- similarity measure
- multi attribute
- ranking models
- support vector
- ranking svm
- web search
- collaborative filtering
- normal form
- machine learning
- multi criteria
- retrieval systems
- preference elicitation
- multi class