Document clustering as a record linkage problem.
Nikiforos Pittaras, George Giannakopoulos, Leonidas Tsekouras, Iraklis Varlamis
Published in: DocEng (2018)
Keyphrases
- document clustering
- record linkage
- privacy preserving
- duplicate detection
- clustering method
- text mining
- clustering algorithm
- document collections
- data cleaning
- linked data
- document representation
- text documents
- non-negative matrix factorization
- cluster analysis
- k-means
- document clusters
- tolerance rough set
- topic extraction
- ant based clustering
- data sets
- pairwise constraints
- text classification
- databases
- data analysis