Inverse Preference Learning: Preference-based RL without a Reward Function.
Joey Hejna, Dorsa Sadigh
Published in: NeurIPS (2023)
Keyphrases
- reward function
- preference learning
- reinforcement learning
- reinforcement learning algorithms
- markov decision processes
- ordinal regression
- optimal policy
- state space
- gaussian processes
- pairwise comparison
- multiple agents
- recommender systems
- preference relations
- inverse reinforcement learning
- active learning
- ranking functions
- multi-agent
- transition probabilities
- model-free
- supervised learning
- learning process
- reward signal
- state variables
- user preferences
- dynamic programming
- temporal difference
- gaussian process
- learning problems
- sufficient conditions
- markov chain
- learning algorithm
- machine learning