PMIndia - A Collection of Parallel Corpora of Languages of India.
Barry Haddow, Faheem Kirefu
Published in: CoRR (2020)
Keyphrases
- parallel corpora
- language independent
- cross lingual
- comparable corpora
- machine translation
- cross language information retrieval
- cross lingual information retrieval
- labor intensive
- statistical machine translation
- machine translation system
- bilingual dictionaries
- cross language
- language resources
- document collections
- sentence pairs
- word pairs
- query translation
- linguistic resources
- wikipedia articles
- error prone
- semi automatic
- sentence level
- information retrieval
- language modeling
- information retrieval systems
- natural language processing
- search engine