Using structural contexts to compress semistructured text collections.
Joaquín Adiego, Gonzalo Navarro, Pablo de la Fuente
Published in: Inf. Process. Manag. (2007)
Keyphrases
- semi-structured
- text collections
- textual data
- structured data
- information extraction
- text documents
- text categorization
- information retrieval
- text mining
- semistructured databases
- semistructured data
- document collections
- data model
- web documents
- text retrieval
- free text
- inverted index
- semistructured documents
- structural features
- keywords
- high dimensional
- machine learning
- text classification
- co-occurrence
- probabilistic model