Mixed Orthographic/Phonemic Language Modeling: Beyond Orthographically Restricted Transformers (BORT).
Robert Gale, Alexandra Salem, Gerasimos Fergadiotis, Steven Bedrick
Published in: RepL4NLP@ACL (2023)
Keyphrases
- language modeling
- language model
- information retrieval
- retrieval model
- query expansion
- cross lingual
- n gram
- probabilistic model
- text classification
- statistical language models
- test collection
- word segmentation
- trec collections
- sentence retrieval
- improvements in retrieval effectiveness
- speech recognition
- information extraction
- vector space model
- document length