Topics to Avoid: Demoting Latent Confounds in Text Classification.
Sachin Kumar, Shuly Wintner, Noah A. Smith, Yulia Tsvetkov
Published in: EMNLP/IJCNLP (1) (2019)
Keyphrases
- text classification
- text documents
- text data
- topic discovery
- latent topics
- topic modeling
- bag of words
- text mining
- text categorization
- naive Bayes
- feature selection
- machine learning
- topic models
- document classification
- k-NN
- labeled data
- latent variables
- sentiment analysis
- keywords
- information retrieval
- n-gram
- databases
- data cleaning
- related topics
- semantic features
- text classifiers
- probabilistic topic models
- data analysis
- multi-label
- co-occurrence
- text collections
- classification accuracy