RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback.

Published in: CoRR (2023)

Keyphrases