SHDC: A Fast Documents Classification Method Based on Simhash.
Liang Gu, Peng Yang, Yongqiang Dong
Published in: ICA3PP (2) (2015)
Keyphrases
- classification method
- k nearest neighbor
- knn
- text classification
- document collections
- classification scheme
- support vector machine svm
- information retrieval
- classification algorithm
- web documents
- support vector machine
- information retrieval systems
- relevant documents
- text documents
- document retrieval
- xml documents
- metadata
- document clustering
- clustering algorithm
- learning algorithm
- real world
- pairwise
- database
- pattern recognition
- keywords
- feature extraction
- decision trees
- retrieval systems