Using Topic Modeling for Code Discovery in Large Scale Text Data.
Zhiqiang Cai, Amanda Siebert-Evenstone, Brendan R. Eagan, David Williamson Shaffer
Published in: ICQE (2020)
Keyphrases
- topic modeling
- text data
- text classification
- text mining
- text documents
- topic models
- knowledge discovery
- text categorization
- latent Dirichlet allocation
- data mining
- bag of words
- real world
- machine learning
- high dimensional
- latent topics
- feature selection
- n-gram
- natural language processing
- information retrieval
- structured data
- document collections
- named entities
- collaborative filtering
- unlabeled data
- pairwise
- labeled data
- WordNet
- data analysis
- document clustering
- kNN
- database
- information extraction