TopCat: Data Mining for Topic Identification in a Text Corpus.
Chris Clifton, Robert Cooley
Published in: PKDD (1999)
Keyphrases
- text corpus
- text corpora
- data mining
- text mining
- topic models
- text documents
- knowledge discovery
- named entities
- topic modeling
- machine learning
- data analysis
- wikipedia articles
- rough sets
- association rules
- knowledge base
- web mining
- digital libraries
- text analysis
- text classification
- text collections
- computational linguistics
- probabilistic model
- information retrieval
- training corpus