Align after Pre-train: Improving Multilingual Generative Models with Cross-lingual Alignment.
Chong Li, Shaonan Wang, Jiajun Zhang, Chengqing Zong
Published in: CoRR (2023)
Keyphrases
- cross lingual
- generative model
- word alignment
- language modeling
- machine translation
- probabilistic model
- cross lingual information retrieval
- language independent
- discriminative models
- cross language
- discriminative learning
- parallel corpus
- text classification
- prior knowledge
- em algorithm
- news articles
- semi supervised
- conditional random fields
- language model
- translation model
- statistical machine translation
- document clustering
- latent dirichlet allocation
- machine translation system
- retrieval model
- cross language information retrieval
- information retrieval
- query expansion
- co occurrence
- n gram
- maximum likelihood
- parallel corpora
- natural language processing
- information extraction
- image segmentation