Like a Good Nearest Neighbor: Practical Content Moderation and Text Classification.
Luke Bates, Iryna Gurevych
Published in: EACL (1) (2024)
Keyphrases
- text classification
- nearest neighbor
- knn
- k nearest neighbor
- text categorization
- text data
- bag of words
- high dimensional
- multimedia content
- multi label
- training set
- nearest neighbor search
- high dimensional data
- data sets
- naive bayes
- labeled data
- multimedia
- text mining
- decision trees
- data points
- semantic features
- data cleaning
- machine learning
- sentiment analysis
- text classifiers
- user generated content
- nearest neighbor queries
- content analysis
- web content
- n gram
- index structure
- metadata
- feature selection
- real world