Reverberant speech separation based on audio-visual dictionary learning and binaural cues.
Qingju Liu, Wenwu Wang, Philip J. B. Jackson, Mark Barnard
Published in: SSP (2012)
Keyphrases
- dictionary learning
- audio visual
- sound source
- speech signal
- multimodal fusion
- sparse representation
- multi modal
- sparse coding
- speech recognition
- visual information
- audio features
- visual data
- multi stream
- speaker verification
- emotion recognition
- pattern recognition
- multimedia
- image patches
- image classification
- high level
- machine learning
- video data
- natural images
- low level
- audio visual speech recognition