Detecting Topics in Documents by Clustering Word Vectors.
Guilherme Raiol de Miranda, Rodrigo Pasti, Leandro Nunes de Castro
Published in: DCAI (2019)
Keyphrases
- latent topics
- document clustering
- keywords
- topic discovery
- topic detection
- information retrieval
- document space
- topic models
- text documents
- statistical topic models
- clustering algorithm
- stop words
- latent Dirichlet allocation
- document set
- topic modeling
- k-means
- clustering method
- text clustering
- vector space
- distributional clustering
- related documents
- word frequencies
- document collections
- word spotting
- document clusters
- cluster labels
- newspaper articles
- natural language text
- word pairs
- text collections
- text corpus
- text corpora
- information retrieval systems
- word frequency
- web documents
- related words
- co-occurrence
- concept space
- multiword
- spoken documents
- binary vectors
- text classification
- text mining
- relevant documents
- XML documents
- writing style
- page layout
- term frequency
- document representation
- vector space model
- text data
- n-gram
- word recognition
- relevance assessments
- document analysis
- key concepts
- document retrieval
- search engine
- handwritten documents
- tf-idf
- test collection