A comprehensive and quantitative comparison of text-mining in 15 million full-text articles versus their corresponding abstracts.
David Westergaard, Hans Henrik Stærfeldt, Christian Tønsberg, Lars Juhl Jensen, Søren Brunak
Published in: PLoS Comput. Biol. (2018)
Keyphrases
- text mining
- journal articles
- scientific literature
- biomedical literature
- information extraction
- textual documents
- digital libraries
- medical subject headings
- data mining
- medline abstracts
- metadata
- statistical analysis
- web mining
- text corpora
- text classification
- natural language processing
- text documents
- document clustering
- medical literature
- information retrieval
- information retrieval systems
- neural network
- machine learning
- news corpus
- text categorisation
- newspaper articles
- high quality
- case based reasoning
- relation extraction
- life sciences
- textual data
- latent dirichlet allocation
- qualitative and quantitative
- databases
- named entities