Categorisation of web documents using extraction ontologies.
Li Xu, David W. Embley
Published in: Int. J. Metadata Semant. Ontologies (2008)
Keyphrases
- web documents
- information extraction
- semi-structured
- web pages
- information integration
- web directories
- vector space model
- automatic extraction
- web search engines
- semantic web
- html documents
- document classification
- web data
- data extraction
- topic specific
- keywords
- wrapper induction
- textual information
- web content
- semantic association
- knowledge base
- semantic relations
- natural language processing
- structured documents
- background knowledge
- domain ontology
- web resources
- document representation
- text mining
- relational databases
- focused crawling
- social annotations
- machine learning
- structured data
- domain specific
- web mining
- active learning
- database
- web information extraction
- information retrieval