Preference-Based Reinforcement Learning Using Dyad Ranking.
Dirk Schäfer, Eyke Hüllermeier
Published in: DS (2018)
Keyphrases
- reinforcement learning
- function approximation
- ranking algorithm
- learning to rank
- ranking functions
- state space
- web search
- reinforcement learning algorithms
- multi agent
- learning algorithm
- machine learning
- link analysis
- model free
- reinforcement learning methods
- temporal difference learning
- information retrieval
- rank order
- temporal difference
- short list
- spam detection
- evaluation measures
- ranked list
- optimal policy
- learning process