Refuse Whenever You Feel Unsafe: Improving Safety in LLMs via Decoupled Refusal Training.
Youliang Yuan
Wenxiang Jiao
Wenxuan Wang
Jen-tse Huang
Jiahao Xu
Tian Liang
Pinjia He
Zhaopeng Tu
Published in:
CoRR (2024)
Keyphrases
training examples
training process
training set
training algorithm
training phase
data mining
artificial intelligence
information systems
web pages
information technology
small number
supervised learning
online learning
training samples