PREDILECT: Preferences Delineated with Zero-Shot Language-based Reasoning in Reinforcement Learning.
Simon Holk, Daniel Marta, Iolanda Leite
Published in: CoRR (2024)
Keyphrases
- reinforcement learning
- programming language
- optimal policy
- language learning
- machine learning
- language processing
- decision making
- state space
- model free
- function approximation
- conditional plans
- data sets
- Markov decision process
- reinforcement learning algorithms
- Markov decision processes
- recommender systems
- multi agent
- learning algorithm
- object categories
- user preferences
- action selection
- temporal difference
- natural language
- pattern languages