Optimizing Language Augmentation for Multilingual Large Language Models: A Case Study on Korean.
ChangSu Choi, Yongbin Jeong, Seoyoon Park, Inho Won, HyeonSeok Lim, Sangmin Kim, Yejee Kang, Chanhyuk Yoon, Jaewan Park, Yiseul Lee, Hyejin Lee, Younggyun Hahm, Hansaem Kim, Kyungtae Lim
Published in: CoRR (2024)
Keyphrases
- language model
- language modeling
- machine translation system
- n gram
- comparable corpora
- cross lingual
- translation model
- probabilistic model
- information retrieval
- retrieval model
- document retrieval
- speech recognition
- query expansion
- parallel corpus
- test collection
- statistical machine translation
- context sensitive
- language modelling
- statistical language models
- natural language
- language independent
- vector space model
- document ranking
- ad hoc information retrieval
- smoothing methods
- relevance model
- pseudo relevance feedback
- document length
- digital libraries
- cross language retrieval
- language model for information retrieval
- hidden markov models
- linguistic resources
- query terms
- bilingual dictionaries
- cross language
- retrieval systems
- query specific
- language models for information retrieval