SALMON: Self-Alignment with Principle-Following Reward Models.

Zhiqing Sun, Yikang Shen, Hongxin Zhang, Qinhong Zhou, Zhenfang Chen, David D. Cox, Yiming Yang, Chuang Gan
Published in: CoRR (2023)
Keyphrases
  • model selection
  • statistical models
  • information systems
  • face recognition
  • training data
  • reinforcement learning
  • parameter estimation
  • experimental data