Clustering web documents using hierarchical representation with multi-granularity.
Faliang Huang, Shichao Zhang, Minghua He, Xindong Wu
Published in: World Wide Web (2014)
Keyphrases
- web documents
- hierarchical representation
- multi granularity
- content similarity
- semi structured
- information extraction
- k means
- web pages
- multi user
- multiresolution
- dynamic integration
- keywords
- returned by a search engine
- coarse to fine
- document clustering
- privacy protection
- object recognition
- html documents
- computer vision
- self organizing maps
- multi class
- natural language
- image segmentation
- website
- real time