The Impact of Preference Agreement in Reinforcement Learning from Human Feedback: A Case Study in Summarization.
Sian Gooding, Hassan Mansoor
Published in: CoRR (2023)
Keyphrases
- reinforcement learning
- case study
- test bed
- function approximation
- human users
- human operators
- multi-document summarization
- learning algorithm
- personality traits
- relevance feedback
- soft constraints
- summary generation
- human teacher
- reinforcement learning algorithms
- co-occurrence
- learning process
- multi-agent
- artificial intelligence