Clustering Web Pages to Identify Emerging Textual Patterns.
Marina Santini
Published in: RECITAL (articles courts) (2005)
Keyphrases
- web pages
- clustering algorithm
- keywords
- previously unknown
- clustering method
- similar patterns
- website
- search engine
- web access logs
- hierarchical clustering
- data mining
- k means
- data records
- web search
- multiple data streams
- categorical data
- web documents
- user sessions
- pattern extraction
- web objects
- cluster analysis
- pattern mining
- dynamically generated
- textual features
- fuzzy clustering
- pattern discovery
- textual information
- clustering quality
- unsupervised learning
- web page classification
- multimedia
- web snippets
- textual contents
- spectral clustering
- data objects
- data clustering
- web content
- web server
- natural language
- metadata