Exploiting bilingual lexicons to improve multilingual embedding-based document and sentence alignment for low-resource languages.
Aloka Fernando, Surangika Ranathunga, Dilan Sachintha, Lakmali Piyarathna, Charith Rajitha
Published in: Knowl. Inf. Syst. (2023)
Keyphrases
- sentence pairs
- cross lingual
- parallel corpus
- cross language information retrieval
- language independent
- source language
- word level
- parallel corpora
- comparable corpora
- parallel texts
- machine translation
- multilingual information retrieval
- bilingual dictionaries
- machine translation system
- text summarization
- target language
- word alignment
- query translation
- cross language
- sentiment classification
- language resources
- multilingual documents
- cross lingual information retrieval
- multilingual retrieval
- document level
- multi lingual
- indian languages
- statistical machine translation
- document images
- language modeling
- document clustering
- language specific
- natural language
- chinese english
- vector space
- n gram
- document collections
- keywords
- natural language processing
- language model
- query terms
- automatic summarization
- word segmentation
- information retrieval