A benchmark for toxic comment classification on Civil Comments dataset.
Corentin Duchene, Henri Jamet, Pierre Guillaume, Réda Dehak
Published in: CoRR (2023)
Keyphrases
- pattern recognition
- classification accuracy
- classification systems
- benchmark datasets
- feature extraction
- feature space
- training dataset
- machine learning methods
- decision trees
- classification process
- feature vectors
- classification rate
- object classification
- support vector machine (SVM)
- automatic classification
- text classification
- classification scheme
- fold cross validation
- UCI datasets
- user comments
- database
- preprocessing
- support vector
- cost sensitive
- image classification
- feature set
- supervised learning
- multi class
- feature selection
- learning algorithm