Extracting Hyponyms of Prespecified Hypernyms from Itemizations and Headings in Web Documents.
Keiji Shinzato, Kentaro Torisawa
Published in: COLING (2004)
Keyphrases
- web documents
- semi-structured
- information extraction
- web search engines
- web pages
- wordnet
- keywords
- document classification
- web data
- vector space model
- data extraction
- focused crawling
- web content
- html documents
- textual information
- unstructured documents
- structured documents
- website
- web logs
- document representation
- information retrieval systems
- content similarity