Rethinking Speech Recognition with A Multimodal Perspective via Acoustic and Semantic Cooperative Decoding.
Tian-Hao Zhang, Haibo Qin, Zhi-Hao Lai, Song-Lu Chen, Qi Liu, Feng Chen, Xinyuan Qian, Xu-Cheng Yin
Published in: CoRR (2023)
Keyphrases
- speech recognition
- speech recognition systems
- cooperative
- speaker independent
- speech recognizers
- hidden Markov models
- acoustic models
- speech recognizer
- automatic speech recognition
- speech processing
- noisy speech
- speech signal
- multi modal
- pattern recognition
- speech synthesis
- language model
- audio visual speech recognition
- speech recognition technology
- multi stream
- noisy environments
- natural language
- machine learning
- acoustic features
- speaker identification
- semantic knowledge
- semantic search
- speaker adaptation
- semantic information
- feature selection
- visual speech
- image processing