Login / Signup
Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs.
Rui Yang
Ruomeng Ding
Yong Lin
Huan Zhang
Tong Zhang
Published in:
CoRR (2024)
Keyphrases
</>
reinforcement learning
prior knowledge
probabilistic model
active learning
statistical model
learning algorithm
bayesian framework
graphical models
unsupervised learning