A Language for Verification and Manipulation of Web Documents: (Extended Abstract).
Luigi Liquori, Furio Honsell, Rekha Redamalla
Published in: Electron. Notes Theor. Comput. Sci. (2006)
Keyphrases
- text classification
- extended abstract
- web documents
- document classification
- web pages
- web search engines
- information extraction
- semi-structured
- machine learning
- bag of words
- keywords
- textual information
- natural language
- html documents
- web data
- link structure
- web logs
- geographic information
- focused crawling
- vector space model
- database
- web directories
- unstructured documents
- web content
- model checking
- query processing