Cross-Lingual Document Similarity Calculation Using the Multilingual Thesaurus EUROVOC.
Ralf Steinberger, Bruno Pouliquen, Johan Hagman
Published in: CICLing (2002)
Keyphrases
- cross lingual
- similarity calculation
- document clustering
- information retrieval systems
- information retrieval
- document collections
- language modeling
- cross lingual information retrieval
- similarity measure
- machine translation
- similarity scores
- language independent
- cross language
- document representation
- keywords
- document retrieval
- text documents
- language model
- retrieval systems
- query expansion
- digital libraries
- text classification
- news articles
- natural language processing
- music retrieval
- vector space model
- source language
- retrieval model
- relevant documents
- web documents
- parallel corpus
- query translation
- clustering algorithm
- text mining
- text categorization
- feature space
- cross language information retrieval
- information access
- retrieval effectiveness
- natural language
- machine learning