Mining Clusters in XML Corpora Based on Bayesian Generative Topic Modeling.
Gianni Costa, Riccardo Ortale
Published in: ICMLA (2015)
Keyphrases
- topic modeling
- text mining
- text corpora
- topic models
- heterogeneous information networks
- latent dirichlet allocation
- probabilistic topic models
- generative model
- xml documents
- natural language processing
- document clustering
- text data
- bayesian networks
- knowledge discovery
- clustering algorithm
- text documents
- text classification
- latent topics
- lda model
- image classification
- maximum likelihood
- databases
- collaborative filtering
- data mining
- graphical models
- probabilistic model
- data analysis
- artificial intelligence
- probabilistic latent semantic analysis
- machine learning
- real world