Reinforcement Learning to Rank with Pairwise Policy Gradient.
Jun Xu, Zeng Wei, Long Xia, Yanyan Lan, Dawei Yin, Xueqi Cheng, Ji-Rong Wen
Published in: SIGIR (2020)
Keyphrases
- learning to rank
- policy gradient
- pairwise
- reinforcement learning
- loss function
- function approximation
- ranking functions
- information retrieval
- reinforcement learning algorithms
- optimal control
- evaluation measures
- supervised learning
- gradient method
- similarity measure
- document retrieval
- markov random field
- semi supervised
- state space
- average reward
- partially observable markov decision processes
- state action
- machine learning
- ranking svm
- pairwise classification
- learning to rank algorithms
- markov decision processes
- optimal policy
- model free
- retrieval systems
- approximation methods
- simulated annealing
- collaborative filtering
- learning process
- support vector
- multi agent
- feature extraction
- learning algorithm