ProGene - A Large-scale, High-Quality Protein-Gene Annotated Benchmark Corpus.
Erik Faessler, Luise Modersohn, Christina Lohr, Udo Hahn
Published in: LREC (2020)
Keyphrases
- high quality
- manually annotated
- annotated corpus
- sequence alignment
- ground truth
- real world
- regulatory networks
- cellular processes
- protein-protein interaction networks
- gene prediction
- microarray
- interaction networks
- genomic sequences
- Homo sapiens
- amino acids
- gene expression
- protein sequences
- signaling pathways
- protein interaction
- DNA binding
- biological entities
- GENIA corpus
- protein folding
- relation extraction
- Saccharomyces cerevisiae
- protein structure
- protein function
- sequence analysis
- experimental conditions
- gene ontology
- gene clusters
- gene function
- protein structure prediction