Comparative Document Analysis for Large Text Corpora.
Xiang Ren, Yuanhua Lv, Kuansan Wang, Jiawei Han
Published in: CoRR (2015)
Keyphrases
- document analysis
- text corpora
- text analysis
- text mining
- document images
- character recognition
- image analysis
- document collections
- text documents
- topic models
- natural language processing
- information extraction
- topic modeling
- computational linguistics
- text collections
- data mining
- concept hierarchy
- text classifiers
- image processing
- web pages