German Text Embedding Clustering Benchmark.
Silvan Wehrli, Bert Arnrich, Christopher Irrgang
Published in: CoRR (2024)
Keyphrases
- clustering method
- clustering algorithm
- text clustering
- k-means
- information retrieval
- database
- short text
- text retrieval
- text representation
- data clustering
- outlier detection
- self-organizing maps
- keywords
- data hiding
- feature selection
- fuzzy clustering
- hierarchical clustering
- unsupervised learning
- cluster analysis
- information theoretic
- text documents
- data points
- free text
- cross language
- categorical data
- vector space
- high dimensional
- topic detection