A Binary-Categorization Approach for Classifying Multiple-Record Web Documents Using Application Ontologies and a Probabilistic Model.
Yiu-Kai Ng, June Tang, Michael A. Goodrich
Published in: DASFAA (2001)
Keyphrases
- web documents
- probabilistic model
- semi-structured
- information extraction
- web search engines
- web pages
- keywords
- vector space model
- bayesian networks
- automatic classification
- semantic web
- textual information
- domain specific
- semi-automatic
- machine learning
- web data
- data representation
- document classification
- document representation
- focused crawling
- social annotations
- unstructured documents