Cost-Effective Proxy Reward Model Construction with On-Policy and Active Learning.

Yifang Chen, Shuohang Wang, Ziyi Yang, Hiteshi Sharma, Nikos Karampatziakis, Donghan Yu, Kevin G. Jamieson, Simon Shaolei Du, Yelong Shen
Published in: CoRR (2024)