The Alignment Ceiling: Objective Mismatch in Reinforcement Learning from Human Feedback.

Nathan O. Lambert, Roberto Calandra
Published in: CoRR (2023)
Keyphrases