Automated Classification of Web Documents into a Hierarchy of Categories.
Michelangelo Ceci, Floriana Esposito, Michele Lapi, Donato Malerba
Published in: IIS (2003)
Keyphrases
- web documents
- automated classification
- hierarchical structure
- web directories
- web pages
- information extraction
- semi-structured
- keywords
- document classification
- web search engines
- web content
- classify documents
- vector space model
- focused crawling
- html documents
- web data
- topic specific
- document representation
- data mining
- structured documents
- textual information
- databases
- natural language processing
- relational databases