DiaBLa: A Corpus of Bilingual Spontaneous Written Dialogues for Machine Translation.
Rachel Bawden, Sophie Rosset, Thomas Lavergne, Éric Bilinski
Published in: CoRR (2019)
Keyphrases
- machine translation
- statistical machine translation
- chinese english
- parallel corpora
- parallel corpus
- machine translation system
- conversational speech
- cross lingual
- spontaneous speech
- cross language information retrieval
- word alignment
- natural language processing
- comparable corpora
- language independent
- natural language
- target language
- pos tagging
- language processing
- information extraction
- english chinese
- dialogue system
- word sense disambiguation
- language resources
- cross lingual information retrieval
- statistical translation models
- query translation
- cross language
- word level
- source language
- finite state transducers
- machine readable dictionaries
- wordnet