Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey.
Zhichen Dong, Zhanhui Zhou, Chao Yang, Jing Shao, Yu Qiao
Published in: NAACL-HLT (2024)
Keyphrases
- denial of service attacks
- denial of service
- dos attacks
- countermeasures
- network security
- lightweight
- spam filters
- malicious attacks
- traffic analysis
- ddos attacks
- natural language
- traffic accidents
- machine learning
- block cipher
- conversational agent
- computer security
- attack detection
- image watermarking
- security threats
- machine learning systems
- conversational agents
- intrusion detection
- nuclear power plant