Keyphrases
- HTML documents
- semantic information
- web documents
- web page retrieval
- automatic extraction
- web pages
- semi-structured
- structured documents
- web content
- databases
- metadata
- semistructured data
- topic maps
- information extraction
- vector space model
- semantic features
- XML documents
- keywords
- information retrieval
- WordNet
- low level
- machine learning
- repeated patterns