Login / Signup

Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs.

Rui YangRuomeng DingYong LinHuan ZhangTong Zhang
Published in: CoRR (2024)
Keyphrases
  • reinforcement learning
  • prior knowledge
  • probabilistic model
  • active learning
  • statistical model
  • learning algorithm
  • bayesian framework
  • graphical models
  • unsupervised learning