Lost in Translation: Large Language Models in Non-English Content Analysis.
Gabriel Nicholas, Aliya Bhatia
Published in: CoRR (2023)
Keyphrases
- content analysis
- language model
- cross language retrieval
- statistical machine translation
- translation model
- language modeling
- machine translation
- document retrieval
- cross language
- speech recognition
- probabilistic model
- cross language information retrieval
- information retrieval
- chinese english
- query translation
- n gram
- retrieval model
- machine translation system
- query expansion
- context sensitive
- collaborative learning
- query terms
- statistical language models
- language modelling
- multiword
- test collection
- cross lingual
- video content
- pseudo relevance feedback
- bilingual dictionaries
- out of vocabulary
- online discussion
- ad hoc information retrieval
- language models for information retrieval
- parallel corpora
- question answering
- smoothing methods
- target language
- natural language processing
- natural language
- bayesian networks