Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization.

Wenkai Yang, Shiqi Shen, Guangyao Shen, Zhi Gong, Yankai Lin
Published in: CoRR (2024)
Keyphrases
  • probabilistic model
  • data mining
  • expert systems
  • pairwise
  • hidden markov models
  • least squares
  • modeling framework