Pagerank based clustering of hypertext document collections.
Konstantin Avrachenkov, Vladimir Dobrynin, Danil Nemirovsky, Kim Son Pham, Elena Smirnova
Published in: SIGIR (2008)
Keyphrases
- document collections
- document clustering
- information retrieval systems
- topic detection
- information retrieval
- document retrieval
- clustering method
- document clusters
- k-means
- text retrieval
- clustering algorithm
- text clustering
- test collection
- scatter gather
- relevant documents
- digital libraries
- cross language
- random walk
- web search
- ad hoc retrieval
- text collections
- document archives
- geographic information retrieval
- xml retrieval
- cluster analysis
- text data
- image retrieval
- document representation
- data collections
- ranking algorithm
- text documents
- high dimensional data
- web pages
- search engine