Topic Modeling by Clustering Language Model Embeddings: Human Validation on an Industry Dataset.
Anton Eklund, Mona Forsman
Published in: EMNLP (Industry Track) (2022)
Keyphrases
- language model
- topic modeling
- topic models
- language modeling
- language modeling framework
- probabilistic model
- information retrieval
- latent Dirichlet allocation
- n-gram
- probabilistic latent semantic analysis
- text mining
- clustering method
- clustering algorithm
- retrieval model
- document retrieval
- query expansion
- mixture model
- k-means
- text classification
- data points
- high dimensional data
- dimensionality reduction
- test collection
- vector space model
- query terms
- relevance model
- pseudo relevance feedback
- vector space
- unsupervised learning
- cross-lingual
- text documents
- smoothing methods
- support vector
- cluster analysis
- knn
- knowledge discovery
- semi-supervised
- collaborative filtering
- generative model