The Microsoft Speech Language Translation (MSLT) Corpus for Chinese and Japanese: Conversational Test data for Machine Translation and Speech Recognition.
Christian Federmann, William D. Lewis
Published in: MTSummit (1) (2017)
Keyphrases
- speech recognition
- test data
- machine translation
- statistical machine translation
- phrase based smt
- target language
- chinese english
- monolingual
- machine translation system
- parallel corpus
- language model
- source language
- natural language
- speech synthesis
- automatic speech recognition
- language processing
- conversational speech
- speech signal
- cross lingual
- test set
- language resources
- isolated word
- parallel corpora
- cross language information retrieval
- comparable corpora
- foreign language
- test cases
- word alignment
- speaker identification
- training data
- hidden markov models
- bilingual dictionaries
- language independent
- information extraction
- query translation
- monolingual retrieval
- speech recognition systems
- spoken language
- noisy environments
- pattern recognition
- translation model
- natural language processing
- spontaneous speech
- word level
- language modeling
- n-gram
- information retrieval
- training set
- word sense disambiguation
- multiword
- word segmentation
- word pairs
- data mining
- probabilistic model
- text to speech
- speech retrieval
- feature selection