Locality sensitive hashing for scalable structural classification and clustering of web documents.
Christian Hachenberg, Thomas Gottron
Published in: CIKM (2013)
Keyphrases
- web documents
- locality sensitive hashing
- web pages
- information extraction
- decision trees
- image classification
- machine learning
- keywords
- nearest neighbor
- database
- feature selection
- feature vectors
- co-occurrence
- feature extraction
- text classification
- high dimensional data
- similarity search
- hash functions
- space efficient
- databases