Clustering documents with an exponential-family approximation of the Dirichlet compound multinomial distribution.
Charles Elkan
Published in: ICML (2006)
Keyphrases
- exponential family
- closed form
- document clustering
- density estimation
- mixture model
- log likelihood
- clustering algorithm
- document collections
- k-means
- statistical models
- information retrieval
- information retrieval systems
- maximum likelihood
- variational methods
- EM algorithm
- document retrieval
- information theoretic
- missing values
- order statistics
- graphical models
- level set
- hidden variables
- feature space