Compression-based distance between string data and its application to literary work classification based on authorship.
Masaki Ishikawa, Hajime Kawakami
Published in: Comput. Stat. (2013)
Keyphrases
- data sets
- data structure
- database
- knowledge discovery
- classification accuracy
- high dimensional data
- raw data
- neural network
- data analysis
- data points
- image compression
- data processing
- data collection
- high quality
- data reduction
- data sources
- classification algorithm
- missing data
- original data
- dimensionality reduction
- string matching
- data distribution
- distance matrix
- image classification
- learning algorithm
- multi class
- databases
- end users
- feature space
- training data
- feature selection