A Document Clustering Algorithm Based on Semi-constrained Hierarchical Latent Dirichlet Allocation.
Jungang Xu, Shilong Zhou, Lin Qiu, Shengyuan Liu, Pengfei Li
Published in: KSEM (2014)
Keyphrases
- latent dirichlet allocation
- latent topics
- topic discovery
- clustering algorithm
- topic models
- lda model
- document clustering
- document similarity
- topic extraction
- latent dirichlet
- topic modeling
- text documents
- latent semantic analysis
- generative model
- hierarchical dirichlet process
- text mining
- statistical topic models
- gibbs sampling
- variational inference
- hierarchical bayesian model
- document classification
- variational bayesian inference
- information retrieval
- document collections
- probabilistic latent semantic analysis
- cluster analysis
- dirichlet process
- k-means
- generative process
- hierarchical bayesian
- tf-idf
- image classification
- markov chain
- co-occurrence
- hidden markov models
- em algorithm
- prior knowledge
- text corpora
- latent topic model
- concept hierarchy
- machine learning
- latent topic models
- data mining