Spanish Biomedical Crawled Corpus: A Large, Diverse Dataset for Spanish Biomedical Language Models.
Casimiro Pio Carrino, Jordi Armengol-Estapé, Ona De Gibert Bonet, Asier Gutiérrez-Fandiño, Aitor Gonzalez-Agirre, Martin Krallinger, Marta Villegas
Published in: CoRR (2021)
Keyphrases
- language model
- Spanish language
- language modeling
- n-gram
- document retrieval
- machine translation system
- statistical machine translation
- speech recognition
- language modelling
- probabilistic model
- retrieval model
- text mining
- information retrieval
- statistical language models
- test collection
- multiword
- ad hoc information retrieval
- query terms
- document level
- question answering
- query expansion
- pseudo relevance feedback
- cross language
- relevance model
- language model for information retrieval