Self-Play Preference Optimization for Language Model Alignment.

Yue Wu, Zhiqing Sun, Huizhuo Yuan, Kaixuan Ji, Yiming Yang, Quanquan Gu
Published in: CoRR (2024)
Keyphrases