Extracting Body Text from Academic PDF Documents for Text Mining.
Changfeng Yu, Cheng Zhang, Jie Wang
Published in: CoRR (2020)
Keyphrases
- text mining
- text documents
- pdf documents
- text data
- scientific documents
- textual documents
- text analytics
- textual data
- text processing
- text classification
- information retrieval
- information extraction
- natural language processing
- scientific literature
- knowledge discovery
- data mining
- machine learning
- text clustering
- biomedical literature
- text corpora
- document clustering
- computational linguistics
- named entities
- data analysis
- topic models
- text categorization
- query processing
- keywords
- multimedia
- databases