Synchronizing Web Documents with Style.
Rodrigo Laiola Guimarães, Dick C. A. Bulterman, Pablo César, Jack Jansen
Published in: WebMedia (2014)
Keyphrases
- web documents
- information extraction
- web pages
- semi-structured
- keywords
- web search engines
- textual information
- html documents
- document classification
- document representation
- search engine
- focused crawling
- vector space model
- web content
- natural language processing
- databases
- web directories
- dynamically generated
- content similarity