A Benchmark Dataset for Multimodal Prediction of Enzymatic Function Coupling DNA Sequences and Natural Language.
Yuchen Zhang, Ratish Kumar Chandrakant Jha, Soumya Bharadwaj, Vatsal Sanjaykumar Thakkar, Adrienne Hoarfrost, Jin Sun
Published in: CoRR (2024)
Keyphrases
- DNA sequences
- benchmark datasets
- natural language
- gene prediction
- problems in computational biology
- tandem repeats
- human genome
- DNA sequencing
- DNA computing
- motif discovery
- coding regions
- binding sites
- databases
- biological sequences
- genomic sequences
- pedestrian detection
- sequence patterns
- gene structure prediction
- protein coding regions
- RNA sequences
- information extraction
- data analysis
- machine learning