"That Is a Suspicious Reaction!": Interpreting Logits Variation to Detect NLP Adversarial Attacks.
Edoardo Mosca, Shreyash Agarwal, Javier Rando-Ramirez, Georg Groh
Published in: ACL (1) (2022)
Keyphrases
- detecting malicious
- detect malicious
- natural language processing
- anomaly detection
- information extraction
- normal behavior
- detection method
- network attacks
- machine translation
- multi-agent
- intrusion detection
- WordNet
- question answering
- countermeasures
- language processing
- co-occurrence
- natural language
- misuse detection
- machine learning
- network traffic
- network intrusion detection
- computer security
- normal traffic
- artificial intelligence