Speaker-Independent Audio-Visual Speech Separation Based on Transformer in Multi-Talker Environments.
Jing Wang, Yiyu Luo, Weiming Yi, Xiang Xie
Published in: IEICE Trans. Inf. Syst. (2022)
Keyphrases
- audio visual
- digit recognition
- sound source
- speaker independent
- multi modal
- speech recognition
- visual information
- emotion recognition
- speaker dependent
- visual data
- multi stream
- multimedia
- speech signal
- hidden markov models
- audio features
- speaker verification
- speech recognizer
- audio visual speech recognition
- speaker identification
- acoustic models
- speaker adaptation
- visual content
- face recognition