Sparse Victory - A Large Scale Systematic Comparison of count-based and prediction-based vectorizers for text classification.
Rupak Chakraborty, Ashima Elhence, Kapil Arora
Published in: RANLP (2019)
Keyphrases
- text classification
- bag of words
- machine learning
- text categorization
- text documents
- prediction accuracy
- sparse data
- text data
- prediction error
- prediction model
- naive bayes
- n-gram
- text mining
- high dimensional
- feature selection
- real world
- databases
- probabilistic model
- sparse coding
- information retrieval
- semantic features
- neural network