Unsupervised concept identification from a large corpus of research documents.
Watcharachat Plangsri, Nalina Phisanbut, Punpiti Piamsa-nga
Published in: KST (2022)
Keyphrases
- newspaper articles
- word frequencies
- person names
- information retrieval
- information retrieval systems
- document collections
- text data
- multiword
- web documents
- supervised learning
- text documents
- text corpus
- document classification
- document analysis
- keywords
- parallel corpus
- document level
- xml documents
- unsupervised learning
- semi supervised
- topic modeling
- document clustering
- similar documents
- historical documents
- natural language
- metadata
- sentence level
- word frequency
- training corpus
- text corpora
- natural language text
- query terms
- language model