AlbNews: A Corpus of Headlines for Topic Modeling in Albanian.
Erion Çano, Dario Lamaj
Published in: CoRR (2024)
Keyphrases
- topic modeling
- scientific articles
- text corpora
- topic models
- latent dirichlet allocation
- text classification
- collaborative filtering
- text mining
- lda model
- topic extraction
- modeling framework
- generative model
- text documents
- latent topics
- monolingual and cross lingual
- real world
- hierarchical bayesian model
- probabilistic topic models
- probabilistic latent semantic analysis
- data mining