MoDNA: motif-oriented pre-training for DNA language model.
Weizhi An, Yuzhi Guo, Yatao Bian, Hehuan Ma, Jinyu Yang, Chunyuan Li, Junzhou Huang
Published in: BCB (2022)
Keyphrases
- language model
- language modeling
- DNA sequences
- n-gram
- document retrieval
- speech recognition
- probabilistic model
- binding sites
- information retrieval
- retrieval model
- biological sequences
- language modelling
- smoothing methods
- mixture model
- query expansion
- context sensitive
- training set
- ad hoc information retrieval
- test collection
- query terms
- statistical language models
- language models for information retrieval
- language model for information retrieval
- document ranking
- motif discovery
- supervised learning
- word clouds
- bayesian networks
- query specific
- document length
- vector space model