Building Machine Translation Tools for Patent Language: A Data Generation Strategy at the European Patent Office.
Matthias Wirth, Volker D. Hähnke, Franco Mascia, Arnaud Wéry, Konrad Vowinckel, Marco del Rey, Raúl Mohedano del Pozo, Pau Montes, Alexander Klenner-Bajaja
Published in: EAMT (2023)
Keyphrases
- word alignment
- machine translation
- data generation
- target language
- parallel corpus
- language resources
- machine translation system
- language processing
- statistical machine translation
- natural language
- cross lingual
- source language
- language independent
- information retrieval
- natural language processing
- information extraction
- multilingual documents
- cross language information retrieval
- natural language generation
- high throughput
- parallel corpora
- active learning
- expert systems
- data streams
- linguistic resources
- Chinese-English
- query translation
- text classification
- knowledge representation
- comparable corpora