Beyond Words: A Topological Exploration of Coherence in Text Documents.
Samyak Jain, Rishi Singhal, Sriram Krishna, Yaman Kumar Singla, Rajiv Ratn Shah
Published in: Tiny Papers @ ICLR (2024)
Keyphrases
- text documents
- text mining
- information extraction
- text categorization
- keywords
- text classification
- news articles
- WordNet
- bag of words
- document clustering
- topic models
- document classification
- text corpus
- text data
- named entities
- document representation
- tf-idf
- text corpora
- feature selection
- latent topics
- n-gram
- text representation
- training documents
- text collections
- automatic text categorization
- computer vision
- co-occurrence
- natural language processing
- web pages
- data mining