Natural language processing pipelines to annotate BioC collections with an application to the NCBI disease corpus.
Donald C. Comeau, Haibin Liu, Rezarta Islamaj Dogan, W. John Wilbur
Published in: Database J. Biol. Databases Curation (2014)
Keyphrases
- natural language processing
- broad coverage
- metadata
- reference resolution
- text collections
- text mining
- information extraction
- natural language
- computational linguistics
- coreference resolution
- wordnet
- machine learning
- textual data
- document collections
- manually annotated
- question answering
- text corpora
- information retrieval
- semantic relations
- named entity recognition
- data sets
- knowledge representation
- text documents
- free text
- relation extraction
- semantic annotation
- disease diagnosis
- ontology learning
- data collections
- semantic analysis
- machine translation
- text processing
- big data
- computational biology
- text summarization
- knowledge base
- web pages