Exploiting Parallel Corpora to Improve Multilingual Embedding based Document and Sentence Alignment.
Dilan Sachintha, Lakmali Piyarathna, Charith Rajitha, Surangika Ranathunga
Published in: CoRR (2021)
Keyphrases
- sentence pairs
- parallel corpora
- cross language information retrieval
- word level
- language independent
- cross lingual
- parallel corpus
- language resources
- cross language
- information retrieval
- machine translation system
- document collections
- sentence level
- cross lingual information retrieval
- document retrieval
- comparable corpora
- parallel texts
- bilingual dictionaries
- document clustering
- text documents
- relevant documents
- co occurrence
- labor intensive
- digital libraries
- source language
- text retrieval
- machine translation
- semantic information
- web documents