RAVSS: Robust Audio-Visual Speech Separation in Multi-Speaker Scenarios with Missing Visual Cues.
Tianrui Pan, Jie Liu, Bohan Wang, Jie Tang, Gangshan Wu
Published in: CoRR (2024)
Keyphrases
- visual cues
- visual speech
- visual information
- noisy environments
- hidden markov models
- speaker identification
- low level
- audio visual
- visual speech recognition
- audio visual speech recognition
- acoustic features
- multiple cues
- multimedia
- speech recognition
- speaker verification
- visual features
- video signals
- audio signal
- audio signals
- pattern recognition
- speech signal
- image processing
- broadcast news
- information retrieval
- multiple visual cues
- gaussian mixture model
- multi modal
- feature extraction
- high level
- search engine