Clustering de documents dans des collections hétérogènes [Document clustering in heterogeneous collections].
Romaric Besançon, Anne-Laure Daquo
Published in: CORIA (2015)
Keyphrases
- document collections
- document clustering
- information retrieval
- data collections
- metadata
- heterogeneous collections
- clustering method
- named entities
- clustering algorithm
- text collections
- cosine similarity
- text clustering
- k-means
- document classification
- document retrieval
- text documents
- digital collections
- web documents
- information retrieval systems
- digital libraries
- similar documents
- distributed information retrieval
- XML documents
- document archives
- free text
- hierarchical clustering
- data clustering
- keywords
- digital objects
- document repositories
- spectral clustering
- TREC collections
- cluster analysis
- self-organizing maps
- machine learning
- data sets