Safety Alignment in NLP Tasks: Weakly Aligned Summarization as an In-Context Attack.

Yu Fu, Yufei Li, Wen Xiao, Cong Liu, Yue Dong

Published in: CoRR (2023)

Keyphrases

natural language processing
contextual information
natural language
neural network
learning algorithm
context aware
text summarization
text analysis
artificial intelligence
search engine
semi supervised
wordnet
context sensitive
countermeasures