Text Classification Using Compression-Based Dissimilarity Measures.
David Pereira Coutinho, Mário A. T. Figueiredo
Published in: Int. J. Pattern Recognit. Artif. Intell. (2015)
Keyphrases
- dissimilarity measure
- text classification
- dissimilarity representation
- similarity measure
- data compression
- feature selection
- bag of words
- machine learning
- clustering method
- text categorization
- compression scheme
- distance function
- distance measure
- compression ratio
- text mining
- knn
- semantic features
- feature space
- distance metric
- image compression
- data cleaning
- n-gram
- edit distance
- compression algorithm
- text classifiers
- information retrieval
- pattern recognition
- multi-label
- information extraction
- data sets