PREDILECT: Preferences Delineated with Zero-Shot Language-based Reasoning in Reinforcement Learning.
Simon Holk, Daniel Marta, Iolanda Leite
Published in: HRI (2024)
Keyphrases
- reinforcement learning
- programming language
- natural language
- Markov decision processes
- language learning
- user preferences
- state space
- function approximation
- optimal policy
- transfer learning
- machine learning
- learning process
- target language
- reinforcement learning algorithms
- soft constraints
- specification language
- preference elicitation
- individual preferences