Measuring the Relatedness between Documents in Comparable Corpora.
Hernani Costa, Gloria Corpas Pastor, Ruslan Mitkov
Published in: TIA (2015)
Keyphrases
- comparable corpora
- parallel corpora
- cross language information retrieval
- bilingual lexicon
- word pairs
- text documents
- machine translation
- text corpora
- news articles
- query terms
- language modeling
- language independent
- bilingual dictionaries
- information retrieval
- cross lingual
- relevant documents
- document collections
- semantic similarity
- linguistic resources
- query translation
- information retrieval systems
- cross language
- document retrieval
- text analysis
- labor intensive
- text clustering
- web documents
- co occurrence
- feature selection
- keywords
- sentence level
- document clustering