Discovery of Maximally Frequent Tag Tree Patterns with Height-Constrained Variables from Semistructured Web Documents.
Yusuke Suzuki, Tetsuhiro Miyahara, Takayoshi Shoudai, Tomoyuki Uchida, Yasuaki Nakamura
Published in: WIRI (2005)
Keyphrases
- web documents
- semi structured
- semistructured documents
- tree structured patterns
- information extraction
- tree patterns
- web data
- data extraction
- web pages
- semi structured data
- web search engines
- keywords
- structured data
- discovering frequent
- knowledge discovery
- web content
- html documents
- mining frequent
- data mining
- text mining
- web data sources
- xml files
- similarity measure
- semistructured data
- xml databases
- databases
- semistructured databases
- relational databases