Human in the Loop: How to Effectively Create Coherent Topics by Manually Labeling Only a Few Documents per Class.
Anton Frederik Thielmann, Christoph Weisser, Benjamin Säfken
Published in: LREC/COLING (2024)
Keyphrases
- information retrieval
- keywords
- text documents
- topic modeling
- highly relevant
- topic discovery
- newspaper articles
- topic models
- related topics
- latent topics
- information retrieval systems
- human subjects
- relevant documents
- topic detection
- document collections
- manually constructed
- document clustering
- web documents
- manually generated
- labor intensive
- document set
- text collections
- document retrieval
- topic hierarchy
- document classification
- relevance assessments
- user interests
- automatically generated
- manual labeling
- key concepts
- Wikipedia pages
- human relevance judgments
- topic specific
- Wikipedia articles
- latent Dirichlet allocation
- retrieval systems
- XML documents
- image segmentation
- metadata