Reward Uncertainty for Exploration in Preference-based Reinforcement Learning.
Xinran Liang, Katherine Shu, Kimin Lee, Pieter Abbeel
Published in: CoRR (2022)
Keyphrases
- reinforcement learning
- exploration strategy
- action selection
- exploration exploitation
- partial observability
- model based reinforcement learning
- state space
- function approximation
- reinforcement learning algorithms
- active exploration
- reward function
- model free
- balancing exploration and exploitation
- eligibility traces
- multi agent
- machine learning
- partially observable environments
- autonomous learning
- Markov decision processes
- optimal policy
- learning agent
- learning process
- supervised learning
- state action
- learning algorithm
- decision theory
- average reward
- expected utility
- temporal difference
- uncertain data
- user preferences
- Bayesian networks
- decision making