An Architecture for Efficient Document Clustering and Retrieval on a Dynamic Collection of Newspaper Texts.
Alan F. Smeaton, Mark Burnett, Francis Crimmins, Gerard Quinn
Published in: BCS-IRSG Annual Colloquium on IR Research (1998)
Keyphrases
- relevance feedback
- document clustering
- document collections
- information retrieval systems
- text documents
- document retrieval
- information retrieval
- image retrieval
- query expansion
- terminology extraction
- document clusters
- automatic categorization
- k-means
- pseudo-relevance feedback
- document representation
- clustering algorithm
- text mining
- data mining
- text retrieval
- vector space model
- document set
- test collection
- document corpus
- document similarity
- topic detection
- text categorization
- keywords
- tf-idf
- artificial intelligence
- relevant documents
- text collections
- related documents
- document classification
- image database
- information extraction