An Empirical Study of Visual Features for DNN Based Audio-Visual Speech Enhancement in Multi-Talker Environments.
Shrishti Saha Shetu, Soumitro Chakrabarty, Emanuël Anco Peter Habets
Published in: ICASSP (2021)
Keyphrases
- audio-visual
- visual features
- visual information
- visual data
- speech enhancement
- visual content
- image classification
- multi-modal
- signal-to-noise ratio
- sound source
- noisy environments
- audio features
- low-level
- image retrieval
- image annotation
- noise reduction
- low-level features
- image collections
- speech signal
- speaker verification
- key frames
- image processing
- contextual information
- co-occurrence
- probabilistic model
- multimedia