Uni-RLHF: Universal Platform and Benchmark Suite for Reinforcement Learning with Diverse Human Feedback.
Yifu Yuan, Jianye Hao, Yi Ma, Zibin Dong, Hebin Liang, Jinyi Liu, Zhixin Feng, Kai Zhao, Yan Zheng
Published in: ICLR (2024)
Keyphrases
- benchmark suite
- reinforcement learning
- real time
- function approximation
- human operators
- wide variety
- human subjects
- user engagement
- feedback mechanisms
- reinforcement learning algorithms
- relevance feedback
- information systems
- transfer learning
- state space
- dynamic programming
- human interaction
- model free
- artificial intelligence
- temporal difference learning
- turing machine
- autonomous learning
- real world
- neural network
- data sets