A little goes a long way: Improving toxic language classification despite data scarcity.
Mika Juuti, Tommi Gröndahl, Adrian Flanagan, N. Asokan
Published in: EMNLP (Findings) (2020)
Keyphrases
- data analysis
- data sets
- data processing
- knowledge discovery
- data distribution
- data collection
- high quality
- small number
- prior knowledge
- extracted features
- data quality
- machine learning
- image data
- pattern recognition
- statistical analysis
- decision trees
- original data
- synthetic data
- support vector machine svm
- learning algorithm
- training data
- support vector machine
- database
- data structure
- training set
- database systems