WhisBERT: Multimodal Text-Audio Language Modeling on 100M Words.
Lukas Wolf, Greta Tuckute, Klemen Kotar, Eghbal Hosseini, Tamar Regev, Ethan Wilcox, Alex Warstadt
Published in: CoRR (2023)
Keyphrases
- language modeling
- n gram
- language model
- information retrieval
- text documents
- statistical language modeling
- audio visual
- word segmentation
- retrieval model
- multiword
- multimedia
- cross lingual
- document level
- query expansion
- text classification
- keywords
- probabilistic model
- anchor text
- text retrieval
- word pairs
- text mining
- relevance model
- translation model
- word level
- text corpora
- multi modal
- retrieval effectiveness
- bag of words
- document representation
- language independent
- search engine
- test collection
- web documents
- text categorization
- information retrieval systems
- tf idf
- document retrieval
- speech recognition
- machine translation system
- digital libraries
- similarity measure