COPF: Continual Learning Human Preference through Optimal Policy Fitting.

Han Zhang, Lin Gui, Yuanzhao Zhai, Hui Wang, Yu Lei, Ruifeng Xu
Published in: CoRR (2023)
Keyphrases