A Multilingual Parallel Corpora Collection Effort for Indian Languages.
Shashank Siripragada, Jerin Philip, Vinay P. Namboodiri, C. V. Jawahar
Published in: CoRR (2020)
Keyphrases
- parallel corpora
- cross lingual information retrieval
- cross lingual
- indian languages
- machine translation
- language independent
- comparable corpora
- cross language information retrieval
- cross language
- chinese english
- language modeling
- machine translation system
- query translation
- document images
- document clustering
- parallel corpus
- multi lingual
- bilingual dictionaries
- document collections
- labor intensive
- word pairs
- statistical machine translation
- word segmentation
- information extraction
- transfer learning
- translation model
- sentence level
- target language
- news articles
- language model