Towards Understanding Jailbreak Attacks in LLMs: A Representation Space Analysis.

Yuping Lin, Pengfei He, Han Xu, Yue Xing, Makoto Yamada, Hui Liu, Jiliang Tang
Published in: CoRR (2024)
Keyphrases
  • neural network
  • machine learning
  • data analysis
  • statistical analysis
  • database
  • learning algorithm
  • database systems
  • recommender systems
  • quantitative analysis
  • automated analysis
  • reduced dimensionality