Europarl-ST: A Multilingual Corpus for Speech Translation of Parliamentary Debates.
Javier Iranzo-Sánchez, Joan Albert Silvestre-Cerdà, Javier Jorge, Nahuel Roselló, Adrià Giménez, Albert Sanchís, Jorge Civera, Alfons Juan
Published in: ICASSP (2020)
Keyphrases
- parallel corpus
- machine translation system
- comparable corpora
- cross language information retrieval
- language resources
- Chinese English
- parallel corpora
- broadcast news
- statistical machine translation
- machine translation
- cross lingual
- multi lingual
- query translation
- cross language
- spontaneous speech
- cross lingual information retrieval
- language independent
- conversational speech
- speech recognition
- sentence pairs
- lexical features
- automatic speech recognition
- cross language IR
- out of vocabulary
- lexical knowledge
- word alignment
- text corpora
- bilingual dictionaries
- digital libraries
- speaker identification
- spoken language
- English words
- text to speech
- information access
- bilingual lexicon
- news articles
- recognition engine
- language modeling
- word pairs
- audio visual
- target language
- spoken document retrieval
- dialogue system
- finite state transducers
- speech synthesis
- information extraction
- linguistic resources
- Spanish language
- manually annotated