Taking an object-oriented approach to restructuring legacy documents for the web.
Jonathan Price
Published in: SIGDOC (2001)
Keyphrases
- web documents
- web data
- multilingual documents
- web information
- digital documents
- website
- web content
- information retrieval
- open directory project
- web applications
- web pages
- electronic documents
- web resources
- information sources
- database
- current web search engines
- newspaper articles
- document collections
- document retrieval
- relevant documents
- xml documents
- link analysis
- multimedia documents
- textual features
- textual data
- content similarity
- desired information
- text information
- web crawler
- document classification
- linked data
- semantic web
- information retrieval systems
- search engine
- web search engines
- web mining
- focused crawling
- legacy systems
- web directories
- topic specific
- structured information
- keywords
- user generated content
- document representation
- user interests
- ranked list
- web search