Fine-Tuning Language Models with Advantage-Induced Policy Alignment.

Banghua Zhu, Hiteshi Sharma, Felipe Vieira Frujeri, Shi Dong, Chenguang Zhu, Michael I. Jordan, Jiantao Jiao
Published in: CoRR (2023)
Keyphrases