An Empirical Analysis of Topic Models: Uncovering the Relationships between Hyperparameters, Document Length and Performance Measures.
Silvia Terragni, Elisabetta Fersini
Published in: RANLP (2021)
Keyphrases
- topic models
- hyperparameters
- model selection
- topic modeling
- latent dirichlet allocation
- bayesian inference
- random sampling
- bayesian framework
- cross validation
- support vector
- probabilistic model
- latent variables
- posterior distribution
- em algorithm
- generative model
- prior information
- closed form
- sample size
- noise level
- gaussian process
- text documents
- text mining
- maximum a posteriori
- co-occurrence
- maximum likelihood
- incomplete data
- missing values
- information retrieval
- prior knowledge
- language model
- active learning
- scoring function
- smoothing methods
- retrieval systems
- mixture model
- training data
- image segmentation