Clustering Comparable Corpora of Russian and Ukrainian Academic Texts: Word Embeddings and Semantic Fingerprints.
Andrey Kutuzov, Mikhail Kopotev, Tatyana Sviridenko, Lyubov Ivanova
Published in: CoRR (2016)
Keyphrases
- word pairs
- comparable corpora
- terminology extraction
- natural language text
- parallel corpora
- bilingual lexicon
- clustering algorithm
- semantic similarity
- clustering method
- document clustering
- semantic relations
- cross-language information retrieval
- semantic network
- text corpora
- bilingual dictionaries
- k-means
- natural language
- keywords
- semantic information
- translation model
- co-occurrence
- parallel corpus
- news articles
- low dimensional
- language model
- data points