Audio-Visual Speech Enhancement based on Multimodal Deep Convolutional Neural Network.
Jen-Cheng Hou, Syu-Siang Wang, Ying-Hui Lai, Jen-Chun Lin, Yu Tsao, Hsiu-Wen Chang, Hsin-Min Wang
Published in: CoRR (2017)
Keyphrases
- audio visual
- convolutional neural network
- speech enhancement
- noisy environments
- multi modal
- noise reduction
- signal to noise ratio
- face detection
- single channel
- speech signal
- multi stream
- multimodal fusion
- visual information
- linear prediction
- visual data
- vocal tract
- smoothing algorithm
- multimedia
- sound source
- multi channel
- neural network
- wiener filter
- edge detection
- audio features
- speech recognition
- high dimensional
- denoising
- pattern recognition
- computer vision