Time-Domain Audio-Visual Speech Separation on Low Quality Videos.
Yifei Wu, Chenda Li, Jinfeng Bai, Zhongqin Wu, Yanmin Qian
Published in: ICASSP (2022)
Keyphrases
- audio visual
- low quality
- video summarization
- audio features
- high quality
- multi modal
- visual data
- sound source
- visual information
- video sequences
- multi stream
- frequency domain
- multimedia
- emotion recognition
- audio visual speech recognition
- video frames
- human activities
- video data
- palmprint
- speaker identification
- multiscale