Hierarchies in HTML Documents: Linking Text to Concepts.
Radek Burget
Published in: DEXA Workshops (2004)
Keyphrases
- html documents
- web documents
- semantic information
- web page retrieval
- structured documents
- information retrieval
- text mining
- web pages
- semistructured data
- automatic extraction
- domain ontology
- repeated patterns
- semi structured
- information extraction
- database
- xml documents
- semantic relatedness
- keywords
- website
- image classification
- web content
- topic maps
- text data
- document representation
- probabilistic model
- data mining