"That Is a Suspicious Reaction!": Interpreting Logits Variation to Detect NLP Adversarial Attacks.
Edoardo Mosca, Shreyash Agarwal, Javier Rando-Ramirez, Georg Groh
Published in: CoRR (2022)
Keyphrases
- detecting malicious
- detect malicious
- natural language processing
- information extraction
- network attacks
- natural language
- detection method
- detection algorithm
- question answering
- text analysis
- web pages
- network traffic
- language processing
- traffic analysis
- text mining
- machine translation
- network security
- normal behavior
- anomaly detection
- normal traffic
- malicious activities