ReVISE: Self-Supervised Speech Resynthesis with Visual Input for Universal and Generalized Speech Enhancement.
Wei-Ning Hsu, Tal Remez, Bowen Shi, Jacob Donley, Yossi Adi
Published in: CoRR (2022)
Keyphrases
- speech enhancement
- visual input
- noisy environments
- speech signal
- noise reduction
- signal to noise ratio
- vocal tract
- single channel
- vision system
- visual information
- linear prediction
- multi channel
- speech recognition
- visual attention
- Wiener filter
- speech synthesis
- visual perception
- background noise
- visual field
- visual features
- hidden Markov models
- edge detection
- human computer interaction
- non stationary
- visual data
- ego motion
- information retrieval
- multiscale
- computer vision