Massively Multilingual Document Alignment with Cross-lingual Sentence-Mover's Distance.
Ahmed El-Kishky, Francisco Guzmán
Published in: CoRR (2020)
Keyphrases
- cross lingual
- word level
- language independent
- machine translation
- parallel corpus
- word alignment
- source language
- document clustering
- cross lingual information retrieval
- cross language
- multi lingual
- sentiment classification
- language modeling
- sentence level
- word sense
- document images
- language specific
- text classification
- monolingual and cross lingual
- text documents
- document collections
- web news
- natural language
- transfer learning
- distance measure
- natural language processing
- target language
- translation model
- statistical machine translation
- chinese english
- query translation
- tf idf
- text summarization
- cross language information retrieval
- feature selection
- document retrieval
- retrieval systems
- language model
- information extraction
- news articles
- web documents
- indian languages
- information retrieval systems
- keywords