Comparative Document Analysis for Large Text Corpora.
Xiang Ren, Yuanhua Lv, Kuansan Wang, Jiawei Han
Published in: WSDM (2017)
Keyphrases
- document analysis
- text corpora
- text analysis
- text mining
- computational linguistics
- document images
- document collections
- image analysis
- character recognition
- text collections
- topic models
- text documents
- information extraction
- text classifiers
- natural language processing
- concept hierarchy
- text classification
- machine learning
- knowledge discovery
- query processing
- data structure
- knowledge base
- feature selection
- computer vision
- information retrieval