The Impact of Preference Agreement in Reinforcement Learning from Human Feedback: A Case Study in Summarization.

Sian Gooding, Hassan Mansoor

Published in: CoRR (2023)

Keyphrases

reinforcement learning
case study
test bed
function approximation
human users
human operators
multi-document summarization
learning algorithm
personality traits
relevance feedback
soft constraints
summary generation
human teacher
reinforcement learning algorithms
co-occurrence
learning process
multi-agent
artificial intelligence