Similarity of Documents and Document Collections using Attributes with Low Noise.
Chris Biemann, Uwe Quasthoff
Published in: WEBIST (2) (2007)
Keyphrases
- document collections
- information retrieval systems
- document retrieval
- information retrieval
- document clustering
- relevant documents
- test collection
- document representation
- text retrieval
- index terms
- document archives
- digital libraries
- similar documents
- text collections
- similarity measure
- cross language
- ad hoc retrieval
- query terms
- document space
- document clusters
- text corpora
- xml retrieval
- document set
- text data
- data collections
- cosine similarity
- topic detection
- scatter gather
- relevance ranking
- retrieval systems
- semantic similarity
- automatic document classification