Mining business topics in source code using latent Dirichlet allocation.
Girish Maskeri Rama, Santonu Sarkar, Kenneth Heafield
Published in: ISEC (2008)
Keyphrases
- source code
- latent dirichlet allocation
- topic models
- software repositories
- text mining
- topic modeling
- open source
- mining software repositories
- generative model
- topic discovery
- legacy systems
- latent topics
- software systems
- data mining
- lda model
- software projects
- gibbs sampling
- software evolution
- software maintenance
- business processes
- high level
- word counts
- knowledge discovery
- business rules
- text documents
- dimensionality reduction
- program understanding
- probabilistic topic models
- machine learning
- source files
- legacy software
- real world
- databases