Do Llamas Work in English? On the Latent Language of Multilingual Transformers.
Chris Wendler, Veniamin Veselovsky, Giovanni Monea, Robert West
Published in: CoRR (2024)
Keyphrases
- language specific
- parallel corpus
- language resources
- natural language
- comparable corpora
- english language
- cross lingual
- machine translation
- language independent
- language learning
- cross language information retrieval
- indian languages
- cross language
- machine translation system
- target language
- english text
- bilingual dictionaries
- native language
- cross lingual information retrieval
- parallel corpora
- n gram
- multilingual information retrieval
- multilingual documents
- spoken language
- language processing
- query translation
- chinese english
- language proficiency
- linguistic resources
- multilingual retrieval
- digital libraries
- linguistic analysis
- text to speech
- source language
- latent variables
- out of vocabulary
- word level
- language technology
- linguistic knowledge
- wide coverage
- information access
- language modeling
- query expansion
- cross language ir