Mixed Preference Optimization: Reinforcement Learning with Data Selection and Better Reference Model.

Qi Gou, Cam-Tu Nguyen
Published in: CoRR (2024)
Keyphrases