An Integrated System of Mining HTML Texts and Filtering Structured Documents.
Bo-Hyun Yun, Myungeun Lim, Soo-Hyun Park
Published in: PAKDD (2003)
Keyphrases
- structured documents
- html documents
- document structure
- structured document retrieval
- information retrieval systems
- data mining
- text mining
- xml documents
- knowledge discovery
- information extraction
- electronic documents
- relevant documents
- web documents
- information retrieval
- digital libraries
- web pages
- database
- data mining techniques
- data analysis
- document representation
- keywords
- learning algorithm
- databases