Discovering unexpected documents in corpora.
François Jacquenet, Christine Largeron
Published in: Knowl. Based Syst. (2009)
Keyphrases
- text corpora
- information retrieval
- document collections
- information retrieval systems
- topic segmentation
- text data
- word frequency
- legal documents
- document retrieval
- xml documents
- data collections
- text documents
- text corpus
- text collections
- vector space model
- natural language processing
- metadata
- free text
- document classification
- topic modeling
- parallel corpora
- document corpus
- parallel corpus
- events occur
- web documents
- database
- relevant documents
- document analysis
- latent semantic analysis
- term frequency
- automatic summarization
- document clustering
- retrieval systems
- user queries
- semantic information