Using Hybrid Methods and 'Core Documents' for the Representation of Clusters and Topics: The Astronomy Dataset.
Wolfgang Glänzel, Bart Thijs
Published in: ISSI (2015)
Keyphrases
- document clustering
- document clusters
- information retrieval
- document set
- text documents
- clustering algorithm
- topic modeling
- document corpus
- metadata
- latent topics
- web documents
- keywords
- topic models
- relevant documents
- document collections
- newspaper articles
- scientific data
- information retrieval systems
- keyphrases
- topic discovery
- related documents
- high dimensional datasets
- topic hierarchy
- synthetic datasets
- text data
- topic detection
- clustering method
- document representation
- highly relevant
- semantic space
- latent Dirichlet allocation
- document retrieval
- image representation
- data points
- XML documents