FacetGist: Collective Extraction of Document Facets in Large Technical Corpora.
Tarique Siddiqui, Xiang Ren, Aditya G. Parameswaran, Jiawei Han
Published in: CIKM (2016)
Keyphrases
- document corpus
- information retrieval
- document collections
- document classification
- keywords
- text collections
- retrieval systems
- document clustering
- information extraction
- document images
- natural language processing
- word frequency
- document structure
- text corpus
- text documents
- document retrieval
- document analysis
- automatic extraction
- database
- web documents
- information retrieval systems
- document representation
- collective intelligence
- user queries
- text corpora
- language model
- domain specific
- web pages
- bilingual lexicon