Private federated discovery of out-of-vocabulary words for Gboard.
Ziteng Sun, Peter Kairouz, Haicheng Sun, Adrià Gascón, Ananda Theertha Suresh
Published in: CoRR (2024)
Keyphrases
- out of vocabulary
- n gram
- language model
- word segmentation
- spoken document retrieval
- cross language information retrieval
- named entity recognition
- broadcast news
- query words
- named entities
- query terms
- parallel corpora
- term frequency
- hand crafted
- cross lingual
- digital libraries
- previously unseen
- machine translation
- word level
- word recognition
- information retrieval
- knowledge discovery
- text classification
- retrieval model
- text documents
- information extraction
- natural language processing
- cross language
- language modeling
- document retrieval
- search engine
- speech recognition