Exploiting Explainability to Design Adversarial Attacks and Evaluate Attack Resilience in Hate-Speech Detection Models.

Pranath Reddy Kumbam, Sohaib Uddin Syed, Prashanth Thamminedi, Suhas Harish, Ian Perera, Bonnie J. Dorr
Published in: CoRR (2023)