Efficient n-gram construction for text categorization using feature selection techniques.
Maximiliano García, Sebastián Maldonado, Carla Vairetti
Published in: Intell. Data Anal. (2021)
Keyphrases
- text categorization
- n gram
- text classification
- feature selection
- language model
- knn
- multi label
- information gain
- language independent
- text documents
- document classification
- k nearest neighbor
- bag of words
- naive bayes
- labeled data
- text mining
- tf idf
- language modeling
- term frequency
- unlabeled data
- viterbi algorithm
- automatic text categorization
- active learning
- feature selections
- text classifiers
- information retrieval
- semi supervised learning
- nearest neighbor
- knowledge discovery
- search engine