A Wikipedia-based Corpus for Contextualized Machine Translation.
Jennifer Drexler, Pushpendre Rastogi, Jacqueline Aguilar, Benjamin Van Durme, Matt Post
Published in: LREC (2014)
Keyphrases
- machine translation
- statistical machine translation
- parallel corpora
- Chinese English
- parallel corpus
- Wikipedia articles
- machine translation system
- information extraction
- natural language processing
- natural language text
- POS tagging
- cross language information retrieval
- cross lingual
- WordNet
- natural language
- language independent
- target language
- word sense disambiguation
- comparable corpora
- language resources
- training corpus
- named entities
- knowledge base
- semi automatically
- document collections
- document level
- semantic relatedness
- text corpora
- source language
- word level
- word sense
- link structure
- text mining