Evaluating Topic Modeling Pre-processing Pipelines for Portuguese Texts.
Antônio Pereira De Souza Júnior, Pablo Cecilio, Felipe Viegas, Washington Cunha, Elisa Tuler de Albergaria, Leonardo Chaves Dutra da Rocha
Published in: WebMedia (2022)
Keyphrases
- topic modeling
- preprocessing
- topic models
- brazilian portuguese
- text documents
- text mining
- latent dirichlet allocation
- modeling framework
- scientific articles
- feature extraction
- text classification
- topic extraction
- latent topics
- probabilistic model
- collaborative filtering
- information retrieval
- document classification
- probabilistic topic models
- text corpora
- keywords
- bayesian networks
- real world
- data mining
- machine translation
- knowledge discovery