SafeAligner: Safety Alignment against Jailbreak Attacks via Response Disparity Guidance.
Caishuang Huang
Wanxu Zhao
Rui Zheng
Huijie Lv
Shihan Dou
Sixian Li
Xiao Wang
Enyu Zhou
Junjie Ye
Yuming Yang
Tao Gui
Qi Zhang
Xuanjing Huang
Published in:
CoRR (2024)
Keyphrases
image alignment
stereo matching
countermeasures
security threats
stereo images
dynamic time warping
malicious users
data mining
disparity estimation
dos attacks
security risks
computer vision
disparity map
stereo correspondence
chosen plaintext