On Training Targets and Objective Functions for Deep-Learning-Based Audio-Visual Speech Enhancement.
Daniel Michelsanti, Zheng-Hua Tan, Sigurdur Sigurdsson, Jesper Jensen
Published in: CoRR (2018)
Keyphrases
- audio visual
- deep learning
- speech enhancement
- multi modal
- visual information
- unsupervised learning
- visual data
- noisy environments
- noise reduction
- machine learning
- multimedia
- higher order
- training set
- supervised learning
- data sets
- pattern recognition
- feature extraction
- signal to noise ratio
- mental models
- weakly supervised
- image processing