Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences.

Corby Rosset, Ching-An Cheng, Arindam Mitra, Michael Santacroce, Ahmed Awadallah, Tengyang Xie
Published in: CoRR (2024)
Keyphrases