Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer.

Zhihan Liu, Miao Lu, Shenao Zhang, Boyi Liu, Hongyi Guo, Yingxiang Yang, Jose H. Blanchet, Zhaoran Wang
Published in: CoRR (2024)
Keyphrases
  • total variation
  • multi agent
  • worst case
  • neural network
  • semi supervised
  • risk management
  • web services
  • data structure
  • image restoration
  • regularization framework