Fast Categorization of Web Documents Represented by Graphs.
Alex Markov, Mark Last, Abraham Kandel
Published in: WEBKDD (2006)
Keyphrases
- web documents
- semi-structured
- information extraction
- web pages
- document classification
- focused crawling
- web search engines
- web content
- html documents
- keywords
- text categorization
- vector space model
- document representation
- graph databases
- textual information
- link structure
- graph theory
- web data
- structured documents
- data representation
- regular expressions
- automatic classification
- directed graph
- topic specific