Adaptive Preference Scaling for Reinforcement Learning with Human Feedback.
Ilgee Hong, Zichong Li, Alexander Bukharin, Yixiao Li, Haoming Jiang, Tianbao Yang, Tuo Zhao
Published in: CoRR (2024)
Keyphrases
- reinforcement learning
- human interaction
- artificial intelligence
- adaptive control
- model-free
- supervised learning
- human behavior
- function approximation
- motor skills
- human operators
- multi-attribute
- dynamic programming
- learning algorithm
- human subjects
- optimal control
- user preferences
- collaborative filtering
- neural network
- learning capabilities
- sensory inputs
- real time