ProtLLM: An Interleaved Protein-Language LLM with Protein-as-Word Pre-Training.
Le Zhuo, Zewen Chi, Minghao Xu, Heyan Huang, Heqi Zheng, Conghui He, Xian-Ling Mao, Wentao Zhang
Published in: CoRR (2024)
Keyphrases
- protein structure
- protein sequences
- amino acids
- subcellular localization
- protein folding
- protein structure prediction
- molecular dynamics
- amino acid sequences
- word order
- linguistic knowledge
- training corpus
- tandem mass spectra
- drug design
- sequence analysis
- natural language
- protein interaction
- word sense disambiguation
- language learning
- programming language