Mitigating Fine-tuning Jailbreak Attack with Backdoor Enhanced Alignment.
Jiongxiao Wang
Jiazhao Li
Yiquan Li
Xiangyu Qi
Junjie Hu
Yixuan Li
Patrick McDaniel
Muhao Chen
Bo Li
Chaowei Xiao
Published in:
CoRR (2024)
Keyphrases
fine tuning
viable alternative
fine tune
image alignment
fine tuned
dynamic time warping
countermeasures
detection mechanism
neural network
genetic algorithm
decision support system
attack detection