Web Documents Categorization using Fuzzy Representation and HAC.
Jiawei Deng, Lihui Chen
Published in: WISE (2) (2000)
Keyphrases
- web documents
- information extraction
- web search engines
- fuzzy sets
- web pages
- keywords
- link structure
- web content
- semi-structured
- document classification
- web data
- textual information
- document representation
- unstructured documents
- focused crawling
- html documents
- vector space model
- text categorization
- structured documents
- image representation
- search engine
- content similarity
- machine learning