Audio-Visual Speech Enhancement in Noisy Environments via Emotion-Based Contextual Cues.
Tassadaq Hussain, Kia Dashtipour, Yu Tsao, Amir Hussain
Published in: CoRR (2024)
Keyphrases
- audio visual
- speech enhancement
- noisy environments
- emotion recognition
- multi modal
- noise reduction
- speaker verification
- speech recognition
- visual information
- single channel
- signal to noise ratio
- speech signal
- background noise
- linear prediction
- visual data
- multimedia
- temporal context
- speaker identification
- audio features
- probabilistic model
- sound source
- automatic speech recognition
- wiener filter
- multiscale
- edge detection
- machine learning
- low level
- contextual information
- vocal tract
- multi channel