The JRC-Acquis: A multilingual aligned parallel corpus with 20+ languages
Ralf Steinberger, Bruno Pouliquen, Anna Widiger, Camelia Ignat, Tomaz Erjavec, Dan Tufis, Dániel Varga
Published in: CoRR (2006)
Keyphrases
- parallel corpus
- cross lingual
- language independent
- cross lingual information retrieval
- target language
- machine translation system
- machine translation
- query translation
- sentence pairs
- statistical machine translation
- lexical knowledge
- text classification
- cross language
- cross language information retrieval
- source language
- language modeling
- parallel corpora
- word alignment
- document classification
- bilingual dictionaries
- document clustering
- comparable corpora
- linguistic resources
- news articles
- feature selection
- information retrieval
- document collections
- text mining
- digital libraries
- search engine
- artificial intelligence