Uncovering Safety Risks in Open-source LLMs through Concept Activation Vector.

Zhihao Xu, Ruixuan Huang, Xiting Wang, Fangzhao Wu, Jing Yao, Xing Xie
Published in: CoRR (2024)
Keyphrases
  • open source
  • open source software
  • source code
  • decision making
  • real time
  • united states
  • feature vectors
  • risk factors
  • risk analysis