Preference-Guided Reinforcement Learning for Efficient Exploration.
Guojian Wang, Faguo Wu, Xiao Zhang, Tianyuan Chen, Xuyang Chen, Lin Zhao
Published in: CoRR (2024)
Keyphrases
- reinforcement learning
- function approximation
- reinforcement learning algorithms
- state space
- optimal policy
- user preferences
- robotic control
- preference elicitation
- Markov decision processes
- temporal difference
- direct policy search
- multi agent reinforcement learning
- machine learning
- temporal difference learning
- stochastic approximation
- model free
- soft constraints
- data sets
- learning process
- multi agent
- neural network
- real robot
- function approximators
- learning agents
- dynamical systems
- learning algorithm
- real world
- qualitative preferences