An Efficient and Noise-Robust Audiovisual Encoder for Audiovisual Speech Recognition.
Zhengyang Li, Chenwei Liang, Timo Lohrenz, Marvin Sach, Björn Möller, Tim Fingscheidt
Published in: INTERSPEECH (2023)
Keyphrases
- speech recognition
- noisy environments
- speech signal
- hidden Markov models
- language model
- automatic speech recognition
- pattern recognition
- speech recognition technology
- noisy speech
- video retrieval
- multimedia content
- speech processing
- speech enhancement
- visual information
- speaker identification
- noise reduction
- background noise
- digit recognition
- speech recognizer
- speech synthesis
- additive noise
- speech recognition systems
- audio visual
- keyword spotting
- speaker independent
- word recognition
- emotion recognition
- neural network
- signal to noise ratio
- bit rate
- speech retrieval
- signal processing
- face recognition