Detecting Cross-Lingual Plagiarism Using Simulated Word Embeddings.
Victor U. Thompson, Chris Bowerman
Published in: CoRR (2017)
Keyphrases
- cross lingual
- translation model
- parallel corpus
- word sense
- word segmentation
- language specific
- machine translation
- word alignment
- language modeling
- language independent
- cross language
- statistical machine translation
- indian languages
- cross lingual information retrieval
- out of vocabulary
- n gram
- machine translation system
- event extraction
- source language
- sentiment classification
- word sense disambiguation
- text classification
- low dimensional
- parallel corpora
- news articles
- vector space
- dimensionality reduction
- co occurrence
- bilingual dictionaries
- machine learning
- data mining
- high dimensional
- k means
- knowledge discovery
- natural language processing
- document clustering
- target language
- query translation