Europarl-ST: A Multilingual Corpus For Speech Translation Of Parliamentary Debates.
Javier Iranzo-Sánchez, Joan Albert Silvestre-Cerdà, Javier Jorge, Nahuel Roselló, Adrià Giménez, Albert Sanchís, Jorge Civera, Alfons Juan
Published in: CoRR (2019)
Keyphrases
- parallel corpus
- machine translation system
- cross language information retrieval
- language resources
- comparable corpora
- chinese english
- broadcast news
- statistical machine translation
- cross lingual
- machine translation
- multi lingual
- parallel corpora
- cross language
- spontaneous speech
- cross lingual information retrieval
- conversational speech
- language independent
- query translation
- speech recognition
- spoken document retrieval
- word alignment
- out of vocabulary
- sentence pairs
- digital libraries
- cross language ir
- lexical knowledge
- speech signal
- spoken language
- speaker identification
- lexical features
- automatic speech recognition
- linguistic resources
- word pairs
- bilingual dictionaries
- finite state transducers
- recognition engine
- english words
- audio visual
- target language
- information extraction
- speech processing
- text to speech
- language specific
- training corpus
- manually annotated
- text corpora
- speech synthesis
- information access
- human machine interaction
- news articles
- question answering
- wordnet
- spanish language
- hearing impaired