RuBioRoBERTa: a pre-trained biomedical language model for Russian language biomedical text mining.
Alexander Yalunin, Alexander Nesterov, Dmitriy Umerenkov
Published in: CoRR (2022)
Keyphrases
- language model
- biomedical text mining
- pre-trained
- text mining
- language modeling
- document retrieval
- information retrieval
- probabilistic model
- semi-automated
- n-gram
- speech recognition
- query expansion
- retrieval model
- mixture model
- training data
- ad hoc information retrieval
- natural language
- context sensitive
- query terms
- training examples
- smoothing methods
- pseudo relevance feedback
- cross-language retrieval
- machine learning
- word sense disambiguation
- face recognition
- translation model
- control signals
- small number
- feature selection
- information extraction
- knowledge discovery