Allowing for The Grounded Use of Temporal Difference Learning in Large Ranking Models via Substate Updates.
Daniel Cohen
Published in: SIGIR (2021)
Keyphrases
- temporal difference learning
- ranking models
- function approximation
- fixed point
- learning to rank
- game playing
- reinforcement learning
- evaluation function
- ranking functions
- web search
- ranking algorithm
- temporal difference
- loss function
- markov decision process
- reinforcement learning algorithms
- benchmark datasets
- web search engines
- supervised learning
- feature space
- model free
- policy iteration
- information retrieval
- neural network
- training data
- machine learning