Document clustering in heterogeneous corpora.
Romaric Besançon, Anne-Laure Daquo
Published in: Document Numérique (2015)
Keyphrases
- document clustering
- document corpus
- text mining
- terminology extraction
- clustering algorithm
- text documents
- document collections
- clustering method
- document representation
- topic extraction
- non-negative matrix factorization
- document clusters
- topic detection
- vector space model
- cluster analysis
- real world
- natural language processing
- k-means
- named entities
- clustering approaches
- text data
- pairwise constraints
- co-occurrence
- knowledge base