Continual Pre-Training for Cross-Lingual LLM Adaptation: Enhancing Japanese Language Capabilities.
Kazuki Fujii, Taishi Nakamura, Mengsay Loem, Hiroki Iida, Masanari Ohi, Kakeru Hattori, Hirai Shota, Sakae Mizuki, Rio Yokota, Naoaki Okazaki
Published in: CoRR (2024)
Keyphrases
- cross lingual
- japanese language
- machine translation
- cross lingual information retrieval
- event extraction
- language modeling
- text classification
- translation model
- training set
- parallel corpus
- query translation
- news articles
- statistical machine translation
- parallel corpora
- knowledge discovery
- document clustering
- document retrieval
- native speakers
- transfer learning
- supervised learning