Subsampling of Frequent Words in Text for Pre-training a Vision-Language Model.
Mingliang Liang, Martha A. Larson
Published in: LGM3A@MM (2023)
Keyphrases
- language model
- n-gram
- multiword
- document level
- information retrieval
- language modeling
- document representation
- document retrieval
- out of vocabulary
- probabilistic model
- translation model
- query expansion
- retrieval model
- keywords
- text retrieval
- speech recognition
- text documents
- word level
- test collection
- dependency structure
- smoothing methods
- ad hoc information retrieval
- statistical language modeling
- word pairs
- context sensitive
- query terms
- bag of words
- mixture model
- text classifiers
- pseudo relevance feedback
- web documents
- text databases
- word error rate
- text classification
- text mining
- relevance model
- word clouds