Dealing with Sparse Document and Topic Representations: Lab Report for CHiC 2012.
Philipp Schaer, Daniel Hienert, Frank Sawitzki, Andias Wira-Alam, Thomas Lüke
Published in: CLEF (Online Working Notes/Labs/Workshop) (2012)
Keyphrases
- document content
- topic discovery
- document set
- latent topics
- topic hierarchy
- textual content
- related documents
- high dimensional
- document images
- information retrieval systems
- word clouds
- document collections
- topic models
- sparse representation
- retrieval systems
- document corpus
- automatic summarization
- document level
- document retrieval
- document classification
- text documents
- web documents
- keywords
- information retrieval
- relevant documents
- vector representation
- single document summarization
- short list
- news stories
- document summaries
- scientific papers
- statistical topic models
- relevance ranking
- document analysis
- multi document summarization
- document clustering
- query terms
- test collection
- language model
- probabilistic model
- feature selection