Influence of various text embeddings on clustering performance in NLP.
Rohan Saha
Published in: CoRR (2023)
Keyphrases
- text mining
- text analysis
- free text
- text clustering
- natural language processing
- high dimensional data
- clustering method
- clustering algorithm
- textual entailment
- k-means
- text processing
- computational linguistics
- cluster analysis
- text retrieval
- categorical data
- broad coverage
- self organizing maps
- document clustering
- text documents
- vector space
- low dimensional
- subspace clustering
- information extraction
- information retrieval
- hierarchical clustering
- linguistic analysis
- lexical semantics
- statistical natural language processing
- dimensionality reduction
- data points
- natural language
- database
- lexical resources
- unsupervised learning
- keywords
- data clustering
- knowledge representation
- data sets
- data mining
- artificial intelligence
- principal component analysis
- text summarization
- WordNet
- fuzzy clustering
- web documents
- question answering