Cross-Modal Parallel Training for Improving end-to-end Accented Speech Recognition.
Renchang Dong, Yijie Li, Dongxing Xu, Yanhua Long
Published in: ICASSP (2024)
Keyphrases
- end to end
- speech recognition
- cross modal
- multi modal
- isolated word
- hidden markov models
- language model
- speech signal
- pattern recognition
- acoustic models
- automatic speech recognition
- visual recognition
- speech recognizer
- multimedia retrieval
- speech recognition systems
- training set
- multimedia databases
- machine learning
- n gram
- visual data
- speaker independent
- visual similarity
- high level