RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback.

Tianyu Yu, Yuan Yao, Haoye Zhang, Taiwen He, Yifeng Han, Ganqu Cui, Jinyi Hu, Zhiyuan Liu, Hai-Tao Zheng, Maosong Sun, Tat-Seng Chua
Published in: CoRR (2023)
Keyphrases
  • fine grained
  • coarse grained
  • human behavior
  • tightly coupled
  • access control
  • massively parallel
  • human subjects
  • human teacher
  • web search
  • human operators
  • data provenance