A Declarative-Friendly API for Web Document Manipulation.
Benjamin Canou, Emmanuel Chailloux, Vincent Balat
Published in: PADL (2013)
Keyphrases
- web documents
- information extraction
- high level
- web pages
- web search engines
- keywords
- focused crawling
- web content
- semi structured
- declarative language
- application programming interface
- prefetching
- textual information
- open source
- html documents
- dynamically generated
- friendly interface
- source code
- vector space model
- web logs
- application developers
- knowledge representation
- web data
- document representation
- third party
- domain independent
- code snippets