Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey.

Zhichen Dong, Zhanhui Zhou, Chao Yang, Jing Shao, Yu Qiao

Published in: NAACL-HLT (2024)

Keyphrases

denial of service attacks
denial of service
dos attacks
countermeasures
network security
lightweight
spam filters
malicious attacks
traffic analysis
ddos attacks
natural language
traffic accidents
machine learning
block cipher
conversational agent
computer security
attack detection
image watermarking
security threats
machine learning systems
conversational agents
intrusion detection
nuclear power plant