Topic-based document segmentation with probabilistic latent semantic analysis.
Thorsten Brants, Francine Chen, Ioannis Tsochantaridis
Published in: CIKM (2002)
Keyphrases
- probabilistic latent semantic analysis
- latent topics
- latent semantic
- topic models
- latent Dirichlet allocation
- topic modeling
- latent semantic analysis
- text documents
- co-occurrence
- image segmentation
- multiscale
- generative model
- visual features
- retrieval systems
- bag of words
- information retrieval
- document set
- semantic information
- probabilistic model
- latent variables
- EM algorithm
- non-negative matrix factorization
- object recognition
- data mining
- web documents
- document retrieval
- query expansion
- tf-idf
- information retrieval systems
- computer vision
- keywords