A High-Quality and Large-Scale Dataset for English-Vietnamese Speech Translation.
Linh The Nguyen, Nguyen Luong Tran, Long Doan, Manh Luong, Dat Quoc Nguyen
Published in: INTERSPEECH (2022)
Keyphrases
- high quality
- machine translation
- text to speech
- english text
- query translation
- cross language information retrieval
- broadcast news
- spoken language
- statistical machine translation
- machine translation system
- target language
- cross language
- speech recognition
- language resources
- cross language retrieval
- source language
- low quality
- benchmark datasets
- image quality
- speech recognition technology
- parallel corpus
- chinese english
- word level
- pronominal anaphora
- answer questions
- small scale
- speech synthesis
- natural language
- english chinese
- english words
- real life
- english language
- speech signal
- automatic speech recognition
- cross lingual
- bilingual dictionaries
- parallel corpora
- finite state transducers
- chinese web
- real world
- out of vocabulary
- comparable corpora
- synthetic datasets
- text retrieval
- web scale