No Pattern, No Recognition: a Survey about Reproducibility and Distortion Issues of Text Clustering and Topic Modeling.
Marília Costa Rosendo Silva, Felipe Alves Siqueira, João Pedro Mantovani Tarrega, João Vitor Pataca Beinotti, Augusto Sousa Nunes, Miguel de Mattos Gardini, Vinícius Adolfo Pereira da Silva, Nádia Félix Felipe da Silva, André Carlos Ponce de Leon Ferreira de Carvalho
Published in: CoRR (2022)
Keyphrases
- topic modeling
- text clustering
- text mining
- text documents
- topic models
- text classification
- latent semantic analysis
- document clustering
- text data
- collaborative filtering
- latent dirichlet allocation
- text categorization
- background knowledge
- object recognition
- clustering algorithm
- information extraction
- data mining
- document representation
- pattern recognition
- named entities
- hierarchical clustering
- k means
- wordnet
- data analysis
- keywords
- unsupervised learning
- feature extraction
- action recognition
- neural network
- similarity measure
- knowledge discovery
- search engine