Mixture of Inference Networks for VAE-based Audio-visual Speech Enhancement.
Mostafa Sadeghi, Xavier Alameda-Pineda
Published in: CoRR (2019)
Keyphrases
- audio visual
- speech enhancement
- multi modal
- visual information
- noise reduction
- signal to noise ratio
- noisy environments
- multi stream
- visual data
- speech signal
- linear prediction
- multimedia
- Bayesian networks
- audio features
- expectation maximization
- single channel
- multiscale
- information retrieval
- image segmentation
- feature extraction
- background noise
- image coding
- Gaussian mixture model
- speech recognition