Margin-based Parallel Corpus Mining with Multilingual Sentence Embeddings.
Mikel Artetxe, Holger Schwenk
Published in: ACL (1) (2019)
Keyphrases
- parallel corpus
- cross lingual
- cross language information retrieval
- sentence pairs
- language independent
- word alignment
- query translation
- machine translation system
- cross lingual information retrieval
- machine translation
- statistical machine translation
- knowledge discovery
- data mining
- low dimensional
- cross language
- text categorization
- dimensionality reduction
- text mining
- digital libraries