Never the Same Stream: netomat, XLink, and Metaphors of Web Documents.
Colin Post, Patrick Golden, Ryan Shaw
Published in: DocEng (2018)
Keyphrases
- web documents
- semi-structured
- data streams
- information extraction
- web pages
- web search engines
- keywords
- link structure
- web content
- textual information
- web data
- focused crawling
- document classification
- HTML documents
- content similarity
- dynamically generated
- database
- vector space model
- structured documents
- domain-specific
- similarity measure
- database systems
- search engine