Developing a specialized directory system by automatically classifying Web documents.
Young Mee Chung, Young-Hee Noh
Published in: J. Inf. Sci. (2003)
Keyphrases
- web documents
- web directories
- semi structured
- information extraction
- web pages
- document classification
- general purpose
- web search engines
- textual information
- focused crawling
- keywords
- extraction rules
- html documents
- web content
- web data
- content similarity
- vector space model
- document representation
- structured documents
- dynamically generated
- tree structured patterns
- unstructured documents
- topic specific
- geographic information
- link structure
- data representation
- relational databases
- metadata