PMC text mining subset in BioC: 2.3 million full text articles and growing.
Donald C. Comeau, Chih-Hsuan Wei, Rezarta Islamaj Dogan, Zhiyong Lu
Published in: CoRR (2018)
Keyphrases
- text mining
- journal articles
- scientific literature
- text documents
- information retrieval systems
- textual documents
- information extraction
- news articles
- biomedical literature
- information retrieval
- digital libraries
- natural language processing
- news corpus
- knowledge discovery
- data analysis
- retrieval systems
- named entities
- text classification
- data mining
- web mining
- sentiment analysis
- textual data
- probabilistic topic models
- text categorisation
- medical subject headings
- link analysis
- document clustering
- real world
- automatic extraction
- topic modeling
- transfer learning
- keywords
- high quality
- citation analysis
- metadata
- bibliographic data
- neural network
- database